Abstract


Twitter has increasingly been used to study various research topics such as 
election predictions, disease spread, etc. However, social media platforms 
do not saturate the entire population in a study area, especially in emerg- 
ing nations, only representing more affluent subpopulations. The U.S. 
Army Engineer Research and Development Center, Construction Engi- 
neering Research Laboratory (ERDC-CERL), as part of a project entitled 
Framework for the Integration of Complex Urban Systems (FICUS), is 
quantifying the utility of demographic information to inform neighbor- 
hood-scale social media models. Using the example topic of infrastructure, 
an open-source model was constructed to collect Twitter data from the
metropolitan area of Manila, Philippines; geotag tweets to neighborhood
grid cells based on language analysis; and produce a sentiment topic map.
ERDC’s social media analysis tools incorporate quantifiable uncertainties 
with specific on-the-ground reporting techniques. By using the Humani- 
tarian Crisis (HC) framework developed by PACOM (another FICUS prod- 
uct) as a model, a framework quantifying the likelihood of being a regular 
social media user was created to implement a data-driven, bottom-up 
framework construction nested within a knowledge-based established 
framework. This framework, and any other produced by the FICUS team,
serve as case studies for augmenting the military operational environment
with quantifiably reduced uncertainties.


DISCLAIMER: The contents of this report are not to be used for advertising, publication, or promotional purposes. Ci- 
tation of trade names does not constitute an official endorsement or approval of the use of such commercial products. 
All product names and trademarks cited are the property of their respective owners. The findings of this report are not to 
be construed as an official Department of the Army position unless so designated by other authorized documents. 
DESTROY THIS REPORT WHEN NO LONGER NEEDED. DO NOT RETURN IT TO THE ORIGINATOR. 
1 Introduction


1.1 Background


The Framework for Integrating the Complexity of Urban Systems (FICUS) 
project was sponsored by the Office of the Assistant Secretary of the Army 
for Acquisition, Logistics, and Technology (ASA(ALT)) for the purpose of 
empowering military planners with new tools for conflating and synthesiz- 
ing data and knowledge that are critical for understanding megacities and 
other dense urban environments (DUEs). The purpose of the FICUS re- 
search effort was to design and develop a computational framework to 
support federated models of complex urban systems and facilitate im- 
proved levels of information support to Joint Intelligence Preparation of 
the Operational Environment (JIPOE) undertakings (Ehlschlaeger et al. 
2018). Understanding DUEs requires integrated, accurate regional- and 
neighborhood-scale conceptual and computational models detailing oper- 
ationally relevant information about the environmental, infrastructural, 
and social systems. 


In developing the overall FICUS capabilities, a subset of ERDC researchers 
demonstrated how broad theory-based frameworks can inform more spe- 
cific risk-evaluation frameworks and vice versa. The ability to link and 
fluctuate between theory and data-driven frameworks enhances the JIPOE 
process by representing the full range of complexity in urban systems. 


One task in the overall development and validation of FICUS was to demon- 
strate how broad, theory-based frameworks can inform and be informed by 
more specific risk-evaluation frameworks. The ability to link and alternate 
between theory- and data-driven frameworks can enhance the JIPOE meth- 
odology by representing greater ranges of complexity in urban systems. In 
previous work, ERDC researchers completed a pilot application of FICUS to 
a theoretical framework for assessment of Humanitarian Crisis (HC) risk 
factors, which was provided by the United States Pacific Command 
(USPACOM) Joint Intelligence Operations Center (JICPAC) (Bastian et al. 
2019). Next, the research team developed a cholera risk analytical frame- 
work that incorporated portions of that theoretical HC risk framework that 
are relevant to cholera outbreaks and epidemics. When the researchers ap- 
plied FICUS methods to large, diverse social science datasets they were able 
to identify previously unavailable quantitative knowledge of significance in 


ERDC/CERL TR-19-14 


mitigating cholera risk and managing responses to outbreaks (Bastian et al. 
2018). This data-driven demonstration also provided results that were used
to improve the HC framework. 


The present project again applied the FICUS data-conflation model and 
portions of the HC risk framework, which portrays risk in terms of five 
high-level conditions: 


• Natural hazards
• Human behavior impact
• Services failure
• Readiness and response inadequacy
• Resilience deficiencies.


The topic of interest for the framework used in this FICUS case study is so- 
cial media—Twitter in particular. This social media application had more 
than 300 million users worldwide as of mid-2017 (Wagner 2017). Twitter 
is similar to a texting application, but users can instantaneously share 
short messages globally to any of a virtually unlimited number of topical 
categories. These categories, known as hashtags, may represent a commu- 
nity of users interested in topics ranging from politics, social issues, or ce- 
lebrity culture to pets and personal hobbies. A hashtag is delimited with an 
initial pound sign (#) followed without a space by a subject term—for ex- 
ample, #spambots or #caturday. Within any hashtag messaging thread, 
participants may reply to any user message posted to Twitter, whether to 
validate someone’s message, rebut it, or begin a longer-form conversation. 
Worldwide, Twitter is enthusiastically used by people from almost all soci- 
oeconomic strata, from young people in the developing world to the most 
powerful political leaders on the planet. 


Twitter has been the topic of several studies related to humanitarian assis- 
tance and disaster relief (HADR) and epidemiology (Szomszor et al. 2010, 
Salathe et al. 2011, MacLean 2015, Meier 2012, Collins 2013, Kumar et al. 
2011, Kumar et al. 2014, Cooper et al. 2015, Cassa et al. 2013, Dredze et al. 
2013, Andrei et al. 2016). Some of this research has addressed topics re- 
flected in other FICUS frameworks. More examples of Twitter research 
topics and analyses can be found in Steiger et al. (2015). The platform’s 
open-source public API makes Twitter a readily available source of data 
for qualitative research that can augment the value of more quantitative 
research in near-real time. In the present FICUS case study, an open-source
Twitter data collection and sentiment analysis model was developed and
combined with a data-driven social media use computational framework in
order to (1) better quantify the uncertainties inherent in social media
data and (2) reduce uncertainties in the model.


1.2 Objective


The objective of this case study was to demonstrate how broad military
operational frameworks can inform and be informed by social media analysis.
At this phase of study, the goal was to demonstrate and validate the
quantitative use of FICUS with large-scale existing datasets to improve the
quality of information provided to military decision makers.


1.3 Approach


Chapter 2 presents a review of the FICUS data-conflation methodology 
and a discussion of previous applications of FICUS to social science frame- 
works. It also explains the significance of social media usage data in the 
Philippines, which is the subject nation for this case study. Chapter 3 de- 
scribes the development of an open-source FICUS Twitter data-conflation 
tool, and Chapter 4 describes development of the social media use data 
framework. Chapter 5 concludes the report with a discussion of the antici- 
pated techniques and applications of the open-source FICUS Twitter tool 
and its complementary social media use framework. The Appendix shows 
the content of the Social Media User framework presented within the 
FICUS data architecture. 
2 FICUS Demographic Models


2.1 Technology overview


ERDC-CERL researchers developed a methodology and model for repre- 
senting multifarious sociocultural data layers in a way that end users will 
understand. This technology, called FICUS (Ehlschlaeger et al. 2018), can 
provide end users more informative, data-based insights into complex sit- 
uational issues that affect military planning and operational activities. The 
central capability of FICUS is a data-conflation model that can combine 
massive demographic databases into GIS map layers of social, infrastruc- 
tural, and environmental metrics that can be aligned to mission-specific 
operational use. Unique to the FICUS model is its ability to account for all 
input data error and model uncertainties using statistically rigorous meth- 
ods. The FICUS model requires that source data be constructed using a 
spatial-temporal uncertainty model presenting alternative representations 
of the data layers based on the known errors and uncertainties. 


The model uses data from a subject nation’s census, the U.S. Agency for 
International Development (USAID 2011), and DoD-sponsored surveys. It 
also requires subject matter experts (SMEs) to represent the range of 
framework-weighting factors based on their knowledge of the complete- 
ness of the available data sources for operational needs. Monte Carlo sim- 
ulation is then applied to survey responses as they are applied to the 
framework, creating a range of likely results for each level of framework
components (i.e., conditions, factors, indicators). 


The FICUS presentation of a range of prospective results allows decision 
makers to understand the utility of the available data. The results are gen- 
erated in the form of geospatial thematic maps at a resolution of 200 me- 
ters per grid cell. Each grid cell contains the range and distribution of 
possible metric values for the population within 800 meters of that cell’s 
location. Figure 1 shows a representative example of metric outputs from 
the FICUS model produced for previous studies. 
Figure 1. Estimated average wealth inequity
between Muslims and Hindus near Dhaka, Bangladesh.


Details of the FICUS model are published in Ehlschlaeger et al. (2016).
Overall, the technique is composed of six steps, illustrated in Figure 2. The 
first three steps focus on simulating the population within the landscape. 
This phase includes accurately representing population densities and fit- 
ting population demographics within that representation. Key to this is 
understanding the environmental factors that influence the attractiveness 
of a site for a household to locate. The last three steps generate indicator 
maps of the simulated households. These steps focus on retaining the er- 
rors and uncertainties of the input data in a way that enables end users to 
understand the impacts on their application and ultimately improve the 
utility of the information for decision makers. 

Each step of the FICUS methodology is briefly explained below:


1. Weight Survey Cases. Survey cases are replicated a number of times to
match demographic characteristics in the overall estimated population
enumerations. The replication process fits the results, which are weighted
using a sum of least squares minimizing specific desirable survey responses.

2. Spatial Allocation. Household survey cases are realized into plausible geo- 
graphic locations. Ultimate household location maps are based on house- 
hold density maps, ground truth data, and survey shuffling for 
optimization. The process of maximum entropy analysis generates house- 
hold density maps. 

3. Shuffle Survey Case Location. Survey case locations are shuffled to im- 
prove spatial statistics. This optimizes a set of proportional and spatial sta- 
tistics for each population realization to create realistic clustering of survey 
responses. 

4. Kernel Density Estimation. For each desired combination of survey re- 
sponses, proportion maps are generated on each population realization 
throughout the study area, representing the percentages of simulated sur- 
vey cases with such responses. This kernel density estimation process is 
done cell by cell across a regularized grid. 

5. Generate Survey Response Maps. Map algebra analysis generates survey
response maps for realizations 1 through n.

6. Calculate Summary Statistics. Throughout the study area, box plot
summary statistic maps are compiled on the minimum, maximum, median,
mean, first quartile, and third quartile of realizations at all study area
locations, as well as the standard deviation and interquartile range for
these locations. Both the summary statistics and the kernel analysis for
each realization provide error and uncertainty estimates.

Steps 1-4 are repeated for dozens, hundreds, or even thousands of
alternative realizations in order to create enough realizations to provide
representative distributions for important survey answers at critical
geographic locations. For example, a survey response that is seldom
answered by respondents would require a larger number of realizations for
its Poisson distribution to reflect the variation of reality, while a
survey response answered by about 50% of the households or people would
take fewer realizations to define the resulting normal distribution.
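
The summary statistics in step 6 can be illustrated with a short sketch. FICUS itself is not written in Python; the code below is only an illustration, and the function names and the dictionary-per-realization layout are assumptions made for this example. It presumes each realization has already been reduced to one metric value per grid cell.

```python
import statistics

def summarize_cell(values):
    """Box-plot style summary of one grid cell's metric across realizations.

    Requires at least two realizations (quantiles and the sample
    standard deviation are undefined for a single value).
    """
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "sd": statistics.stdev(ordered),
        "iqr": q3 - q1,
    }

def summarize_realizations(realizations):
    """realizations: list of dicts mapping a cell ID to its metric value."""
    cell_ids = realizations[0].keys()
    return {
        cell: summarize_cell([r[cell] for r in realizations])
        for cell in cell_ids
    }
```

A wide interquartile range or standard deviation for a cell signals that the input data leave that location's metric poorly constrained, which is the kind of uncertainty the summary maps are meant to expose.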


2.2 Initial humanitarian crisis framework


Figure 3 depicts a theoretical HC framework of conditions, factors, and
indicators developed for a previous FICUS study (Bastian et al. 2019). The
present case study is informed by this prototype HC framework. Each 
framework category, from the higher to lower levels indicated by the 
framework column headings as read from right to left, measures a differ- 
ent dimension that represents a potential risk or stressor. JICPAC pro- 
vided this prototype theoretical HC framework and chose Bangladesh as 
the subject of a case study to demonstrate the usefulness and functionality 
of a computational framework for better-informed decision making. 


The purpose of the prototype HC framework is to show estimated risk lev- 
els in a way that allows for comparison within and across nations. The HC 
framework, and others developed using the same methods (including 
FICUS data conflation), encompass three categorical levels—conditions, 
factors and indicators. At the highest framework level shown in Figure 3, 
each condition encompasses a broad category of events and circumstances 
that cause or lead to humanitarian crises. The first HC framework condi- 
tion, Natural Hazards, encompasses crisis events that often occur within a 
short temporal range but seriously affect large populations and produce 
ramifications for years to come. 
Figure 3. Humanitarian Crisis framework.


The second HC condition, Human Behavioral Impact, addresses negative
human impacts on the natural environment, people, and infrastruc- 
ture/built environment that can potentially increase risk of a crisis. Service 
Failure, the third condition, focuses on the failure of organized entities 
(mostly public, but private ones may be included) to provide basic services 
to citizens. These services include factors such as the rule of law, polic- 
ing/justice, health and medical care, and public utilities. The last two condi- 
tions—Readiness and Response Inadequacies, and Resilience Deficiencies— 
primarily account for systemic factors related to the nation’s ability to
accommodate, respond to, and recover from humanitarian crises and disasters.


2.3 Social media use in the Philippines


Filipinos are some of the world’s most avid social media users, making the 
Philippines an excellent location for Twitter analysis. Currently, 58% of the 
population are monthly active social media users (Third Team Media 2017). 
Filipinos spend more time (4.17 hours per day) on social media than people
in any other country in the world (Third Team Media 2017). While Facebook,
with over 1.87 billion active users worldwide, is the most popular social
media platform in the Philippines, Twitter is still in the top 10 with over
317 million active users worldwide (Third Team Media 2017). Filipino Twitter
users have been shown to engage with the platform more actively than other
global users, with 69% of users engaging daily versus the global average of
33% (Branding in Asia 2016).
A heat map showing the percentage of the most recent million tweets in the 
Philippines shows that while tweet activity is clustered in the megacity of 
Manila, it is also widespread throughout the islands (Figure 4). 


Given that social media has broadly penetrated the Philippines market, 
and that the use of social media has been proven to be effective in HADR 
crises, the research team sought to combine Twitter data with a
computational framework specific to social media use. In this way, the
uncertainty of
tweet locations can be better calibrated against the framework model, and 
the result can potentially augment a broader military operational frame- 
work such as the HC framework. 
Figure 4. Heat map of Twitter activity in the Philippines,
21 November 2017.
3 Developing the FICUS Twitter Tool


The FICUS open-source Twitter tool consists of a series of R scripts inte- 
grated into the Open Modeling System (OMS) backend. These scripts can 
be accessed and run from any location with an internet connection and ac- 
cess to the cloud repository. The goal of the tool is for users to identify a 
region of interest, a “bag of words,” and a time period (optional) with 
which to query Twitter, with the result being an interactive map printed in 
the browser. The map displays tweets color-coded by sentiment and aver- 
aged over a neighborhood grid cell (Figure 5). 


For the first iteration of the FICUS Twitter tool, the region of interest was 
the City of Manila and the surrounding metro area. The “bag of words” 
was a list of 204 words compiled by the authors related to transportation 
and infrastructure in Manila, in both English and Filipino. 


The tool can be broken down into four main components, depending on 
what kind of research and analysis a user is looking to complete. The
four components are Collection, Geotagging, Sentiment Analysis, and
Visualization. Each component was built using the R language. They can be
used as standalone tools or combined to run as a whole (the default for the 
FICUS Twitter tool). 


Figure 5. Example of the Interactive Map produced by the FICUS Twitter Tool. 

The Collection component allows users to collect and store tweets based
on a defined geography, bag of words, and, optionally, a time period within
the last 7 days. Because the Collection component uses the Twitter Search
API and not the Streaming API, it goes back in time from when the query is
submitted to return results.


The Geotagging component attempts to match a percentage of tweets 
without geographic latitude and longitude coordinates to neighborhood 
areas (typically 1 km x 1 km grid squares) based on the language of the 
tweets. The methodology and logic for this method are based on
Paraskevopoulos et al. (2016).


The Sentiment Analysis component assigns a numeric sentiment score to 
each tweet, where a positive number denotes a positive sentiment and a 
negative number denotes a negative sentiment. Numbers are calculated 
based on the number of matches between dictionaries of positive and neg- 
ative words (compiled by Liu et al. 2005 and translated into both English 
and Filipino) and the tweet text. 


The Visualization component combines the visualization created by the 
grid cells color-coded to sentiment with other visualizations produced 
within the FICUS open-source OMS environment. The tweets can also be 
visualized on a desktop using both Leaflet and R/RStudio.


The following sections detail each component methodology and its logic. 


3.1 Collection


The Collection component uses the public Twitter Standard Search API, 
which supports up to 7 days of tweet history from the point in time of the 
query (Twitter 2017). The public Search API allows free, rate-limited
access to an average of 1-2% of the total tweets available, although
limiting the query with specific search terms, a geographic bounding box,
etc., would theoretically increase the percentage of total tweets returned.
Users can also access a sample of tweets in real time using the publicly
available Twitter Streaming API. Higher-fidelity, more reliable access to
other Twitter streams is available for purchase from Twitter.


With an authorized connection to the Twitter public Search API, the
Collection component will return the first 100 tweets per term in the “bag
of words” for the pre-defined geographic boundary. The “bag of words”
consists of a .csv file in which each word has its own row. The boundary is
a circle identified by a center coordinate (latitude, longitude) and a
radius distance. Given a shapefile (or other spatial dataset) of a city,
state, or other geographic boundary, GIS software (such as R or ESRI’s
ArcGIS) can be used to create a circular bounding geometry to determine the
appropriate center coordinate and radius length. In the case of the FICUS
Twitter tool prototype, the bag of words was a list of 203 words related to
infrastructure in both English and Filipino (translated using Google
Translate), ranging from the generic (e.g., “water,” “transportation”) to
the specific names of major roads, transportation modes, and waterways in
Manila (e.g., “Ninoy Aquino,” “jeepney”). The bounding geometry was
based on the City of Manila.
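
As a rough illustration of deriving the query circle, the sketch below computes a center coordinate and radius that enclose a rectangular bounding box. It is written in Python rather than the tool's R, the function names are hypothetical, and it assumes the boundary has already been reduced to minimum and maximum latitude and longitude (for example, from a shapefile's extent).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def bounding_circle(min_lat, min_lon, max_lat, max_lon):
    """Return (center_lat, center_lon, radius_km) enclosing the box."""
    center_lat = (min_lat + max_lat) / 2
    center_lon = (min_lon + max_lon) / 2
    corners = [(min_lat, min_lon), (min_lat, max_lon),
               (max_lat, min_lon), (max_lat, max_lon)]
    # The radius must reach the farthest corner so the circle covers the box.
    radius_km = max(haversine_km(center_lat, center_lon, lat, lon)
                    for lat, lon in corners)
    return center_lat, center_lon, radius_km
```

A circle that encloses a box necessarily covers some area outside it, so tweets returned for the circle may still need to be clipped to the actual boundary afterward.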


Tweet metadata that are downloaded for each tweet can include: tweet 
text, screenname, time and date stamp, tweet ID, favorite count, if the 
tweet is a retweet, if the tweet was a reply (and the screenname of the orig- 
inal tweet which is being replied to), tweet source, and latitude and longi- 
tude coordinates. The FICUS Twitter tool also adds a column with a time 
and date stamp for when the tweet was collected (i.e., the time of the 
query) and removes duplicate tweets based on the ID. 


The Collection component stores all tweets in a table format within an
SQLite database, and also exports them as a .csv file.
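
The storage step can be sketched as follows. This is a Python illustration, not the tool's R code; the table name, column names, and the dictionary layout of each tweet are assumptions chosen for the example. Duplicates are dropped by making the tweet ID the primary key, and every row is stamped with the collection time.

```python
import csv
import sqlite3
from datetime import datetime, timezone

def store_tweets(conn, tweets, collected_at=None):
    """Insert tweets, skipping any whose ID is already stored.

    tweets: iterable of dicts with at least "id" and "text" keys
    (a hypothetical layout for this sketch). Returns the row count.
    """
    collected_at = collected_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id TEXT PRIMARY KEY, text TEXT, screenname TEXT, created_at TEXT, "
        "latitude REAL, longitude REAL, collected_at TEXT)"
    )
    rows = [
        (t["id"], t["text"], t.get("screenname"), t.get("created_at"),
         t.get("latitude"), t.get("longitude"), collected_at)
        for t in tweets
    ]
    # INSERT OR IGNORE leaves the first copy of a duplicate ID in place.
    conn.executemany(
        "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]

def export_csv(conn, handle):
    """Write the tweets table, with a header row, to an open text file."""
    cursor = conn.execute("SELECT * FROM tweets ORDER BY id")
    writer = csv.writer(handle)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
```

Because deduplication happens at insert time, repeated queries over overlapping time windows can be appended to the same database without inflating tweet counts.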


3.2 Geotagging based on language modeling


Geotagging tweets that are lacking geographic coordinates has been the 
subject of several scholarly studies (Chandra et al. 2011, Hironaka et al. 
2016, Mahmud et al. 2012, Hecht et al. 2011, Wing et al. 2011, Schulz et al. 
2013, Roller et al. 2012, Compton et al. 2015, Poulston et al. 2017). These 
methods range from using screennames of followers to in-text interactions
between users (“mentions”), language analysis, profile information
analysis, and external text corpora to identify the location from which the
tweet was generated and/or the home location of the user. Table 2 of
Steiger et al. (2015, p. 14) provides a detailed overview of papers
conducting spatiotemporal Twitter analyses.


Despite the deluge of research that has been conducted in the past decade 
on geotagging tweets, no one methodology has proven to be better than the 
others. Obtaining the highest possible percentage of successfully geotagged 
tweets depends on the number of tweets in the dataset, the bounding
geography, and the desired granularity of geocoding (e.g., country, state,
city). For HADR and other event-related efforts to use social media for
crisis management, finer-grained geolocation techniques (i.e., below the
city level) are not only more desirable statistically but can be vital to
accomplishing operational mission goals. Ozdikis et al. (2017) published a
review of studies focused on targeted event types and the granularity of
estimated locations, the majority of which are below the city level
(Table 1, p. 299).


In general, research has shown that language (text) analysis of tweets, par- 
ticularly when combined with other analysis geotagging methods, pro- 
duces the highest percentage of accurately geotagged tweets at below a 
city-level scale (Flatow et al. 2015, Kinsella et al. 2011, Sadilek et al. 2012, 
Paraskevopoulos et al. 2016). As the methodology presented by Par- 
askevopoulos et al. (2016) was one of the more recently published meth- 
ods for geolocalizing non-geotagged tweets, and was shown to be an 
effective geotagging approach for smaller spatial geographies and smaller 
datasets of tweets, it was chosen by the authors to be replicated and modi- 
fied using open-source R code. This methodology analyzes the text of ge- 
otagged tweets clustered into grid cells based on the concordance and 
significance of each word, then compares these numbers to those of non-
geotagged tweets to find a match. As such, SME knowledge of non-English
languages was not required.


3.2.1 Geotagging methodology functions 


There are three main functions completed by this portion of the FICUS 
Twitter tool throughout the geotagging process. They include, in order of 
operation: assigning geotagged tweets from the raw tweet dataset to a 
specified neighborhood grid cell, calculating concordance and significance, 
and calculating similarity between neighborhood grid cells and non-ge- 
otagged tweets. All algorithms which perform these functions and were 
replicated in the FICUS Twitter tool R code can be found in Paraskevopou- 
los et al. (2016). 


3.2.1.1 Neighborhood grid cell assignment 


As in Paraskevopoulos et al. (2016), the FICUS team used 1 km x 1 km grid 
cells as pseudo “neighborhoods.” Larger grid cells or other neighborhood
spatial boundaries (e.g., census sub-geographies, zip codes) could easily
be substituted in place of the 1 km² grid.

To geolocalize non-geotagged tweets, geotagged tweets must first be
assigned to a neighborhood grid cell based on whether their coordinates
fall within the grid cell extents. The number of the grid cell becomes the
geotagged tweet’s “Area ID.” On average, approximately 1-2% of all tweets
in a raw dataset were geotagged and assigned an Area ID.
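
A minimal sketch of the grid cell assignment is shown below, in Python rather than the tool's R. It assumes a regular grid anchored at a southwest origin and numbered row by row, and it uses a flat-earth approximation of kilometers per degree; the function name, the numbering scheme, and the approximation are all assumptions for illustration, not the report's implementation.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate; adequate for a city-scale grid

def assign_area_id(lat, lon, origin_lat, origin_lon,
                   n_cols, n_rows, cell_km=1.0):
    """Return the row-major Area ID of the cell containing (lat, lon).

    (origin_lat, origin_lon) is the southwest corner of the grid.
    Returns None when the point falls outside the grid extents.
    """
    # Longitude degrees shrink with latitude, so scale by cos(latitude).
    km_per_deg_lon = KM_PER_DEG_LAT * math.cos(math.radians(origin_lat))
    col = math.floor((lon - origin_lon) * km_per_deg_lon / cell_km)
    row = math.floor((lat - origin_lat) * KM_PER_DEG_LAT / cell_km)
    if 0 <= col < n_cols and 0 <= row < n_rows:
        return row * n_cols + col
    return None
```

Substituting census geographies or zip codes for the grid, as the text notes is possible, would replace this arithmetic with a point-in-polygon test against those boundaries.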


While it has proven incredibly difficult to find any resources published by 
Twitter which quantify the percentage of geocoded tweets received as a 
query result compared to the actual percentage of total geocoded tweets 
produced on the platform, work by Morstatter et al. (2013) found that the 
public API covered over 90% of the geocoded tweets available from the 
Twitter Firehose (a paid version of the API which allows access to nearly 
all tweets produced on the platform). Ergo, it is safe to assume that all geo- 
coded tweets returned for a query on the public Twitter API are a robust 
representation of the geocoded tweets as a whole for that area. 


3.2.1.2 Concordance and significance 


In text/language analysis, concordance is the number of appearances of
each keyword (i.e., frequency), and significance is a weight based on the 
importance of each word in relation to other words in the text corpus. Con- 
cordance was calculated for both (1) the text corpus for all geotagged 
tweets in each grid cell, and (2) the text corpus of each non-geotagged 
tweet. Significance was calculated across all tweets, both geotagged and 
non-geotagged. By giving each keyword a weight (equal to concordance di- 
vided by total words multiplied by significance) (1) per grid cell and (2) per 
non-geotagged tweet, similarity can then be computed to geolocalize the 
non-geotagged tweets. Low weights (i.e., below 0.1) can also be filtered out 
to further refine results. 
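
The keyword weight described above (concordance divided by total words, multiplied by significance) can be sketched in Python as follows. The report does not give the significance formula, so the inverse-document-frequency style weight used here is an assumption for illustration only; the exact definition is in Paraskevopoulos et al. (2016). The whitespace tokenizer and function names are likewise hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for this sketch)."""
    return text.lower().split()

def significance(documents):
    """Map each word to a corpus-wide weight.

    documents: list of token lists covering all tweets.
    ASSUMPTION: an IDF-like weight, log(1 + N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    return {word: math.log(1 + n_docs / df) for word, df in doc_freq.items()}

def keyword_weights(tokens, sig, min_weight=0.0):
    """Weight = (concordance / total words) * significance, per keyword.

    tokens is either one non-geotagged tweet or the pooled text of all
    geotagged tweets in a grid cell. Weights below min_weight are dropped.
    """
    concordance = Counter(tokens)
    total = len(tokens)
    weights = {word: (count / total) * sig.get(word, 0.0)
               for word, count in concordance.items()}
    return {word: w for word, w in weights.items() if w >= min_weight}
```

Setting `min_weight=0.1` would reproduce the optional low-weight filter mentioned in the text.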


3.2.1.3 Similarity computation 


To calculate the similarity, the magnitude (Euclidean norm) is extracted
(1) for all grid cells and (2) for all non-geotagged tweets. The similarity
computation function iterates through all non-geotagged tweets and per- 
forms the following steps: 


1. For each non-geotagged tweet, find grid cell areas which contain the same
terms within the tweet text.

2. If there is a match, multiply the weights of the keywords together.

3. Sum all weight products for all keywords that appear in both the
non-geotagged tweet and in the grid cell area tweets.

4. Divide this sum by the product of the grid cell magnitude and the
non-geotagged tweet’s magnitude.


The result is a similarity number for each non-geotagged tweet for each 
potential grid cell location. A probability distribution is then created for 
each non-geotagged tweet by dividing the similarity result by the total sum 
of all similarities, which normalizes the results and allows for further un- 
certainty quantification and distribution. Currently, the FICUS Twitter 
tool selects the grid cell location candidate with the highest probability for 
each non-geotagged tweet as the tweet location. 
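
The four steps amount to a cosine similarity between keyword-weight vectors, followed by normalization into a probability distribution. A minimal Python sketch is given below; the function names and the dictionary representation of weights are assumptions for illustration, and the tool itself implements this logic in R.

```python
import math

def magnitude(weights):
    """Euclidean norm of a keyword-weight vector stored as a dict."""
    return math.sqrt(sum(value * value for value in weights.values()))

def cosine_similarity(tweet_weights, cell_weights):
    """Steps 1-4: shared keywords, weight products, sum, normalize."""
    shared = tweet_weights.keys() & cell_weights.keys()
    denominator = magnitude(tweet_weights) * magnitude(cell_weights)
    if not shared or denominator == 0:
        return 0.0
    dot = sum(tweet_weights[w] * cell_weights[w] for w in shared)
    return dot / denominator

def locate(tweet_weights, cells):
    """Return (best Area ID, probability per Area ID) for one tweet.

    cells: dict mapping Area ID to that cell's keyword weights.
    Returns (None, {}) when the tweet shares no keywords with any cell.
    """
    sims = {area: cosine_similarity(tweet_weights, weights)
            for area, weights in cells.items()}
    total = sum(sims.values())
    if total == 0:
        return None, {}
    probs = {area: sim / total for area, sim in sims.items()}
    best = max(probs, key=probs.get)
    return best, probs
```

Returning the full distribution alongside the best candidate keeps the information needed for uncertainty quantification, even though only the highest-probability cell is currently used as the tweet location.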


3.2.2 Model validity testing 


In order to test the validity and accuracy of the geotagging methodology 
used in the FICUS Twitter tool, a test code was written to iterate over all 
geotagged tweets in a raw Twitter dataset. Using Monte Carlo simulation 
and permutation, a percentage of tweets was randomly selected to act as 
“gold standard” geotagged tweets, while the rest acted as a sample dataset. 
The model was tested on two different tweet datasets related to infrastruc- 
ture in the City of Manila: one from April 2017, and one from October 
2017. The results can be seen in Tables 1 and 2, respectively. Table 3
contains the results from the same model run for the metropolitan Manila
geography in October 2017.


As the model iterated over the dataset 25 times at each set percentage for 
“gold standard” geotagged tweets, the tables display an average of the 25 
runs per percentage. Distance error is calculated as a straight-line distance 
between grid cell centroids. 
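
The hold-out procedure can be sketched as below, in Python for illustration. The `predict` callable stands in for the geotagging model and is hypothetical, as are the function names and the row-major cell numbering; percentages are taken over the sample set, which is an assumption about how the table columns were computed.

```python
import math
import random

def centroid_distance_km(cell_a, cell_b, n_cols, cell_km=1.0):
    """Straight-line distance between centroids of two row-major cells."""
    row_a, col_a = divmod(cell_a, n_cols)
    row_b, col_b = divmod(cell_b, n_cols)
    return math.hypot(row_a - row_b, col_a - col_b) * cell_km

def evaluate(tweets, predict, gold_fraction, n_cols, runs=25, seed=0):
    """Average hold-out results over repeated random splits.

    tweets: list of (text, area_id) pairs for geotagged tweets.
    predict: hypothetical callable (gold_tweets, text) -> area_id or None.
    """
    rng = random.Random(seed)
    any_pct, correct_pct, mean_err = [], [], []
    for _ in range(runs):
        shuffled = list(tweets)
        rng.shuffle(shuffled)
        n_gold = max(1, round(len(shuffled) * gold_fraction))
        gold, sample = shuffled[:n_gold], shuffled[n_gold:]
        if not sample:
            continue
        guesses = [(predict(gold, text), truth) for text, truth in sample]
        located = [(g, t) for g, t in guesses if g is not None]
        errors = [centroid_distance_km(g, t, n_cols) for g, t in located]
        any_pct.append(100 * len(located) / len(sample))
        correct_pct.append(100 * sum(g == t for g, t in located) / len(sample))
        if errors:
            mean_err.append(sum(errors) / len(errors))

    def average(values):
        return sum(values) / len(values) if values else None

    return {"any_pct": average(any_pct),
            "correct_pct": average(correct_pct),
            "mean_error_km": average(mean_err)}
```

Fixing the random seed makes a run reproducible, which helps when comparing gold-standard percentages on the same dataset.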


Because the April 2017 dataset (Table 1) has the highest number of
geotagged tweets of the datasets, this could explain its higher percentages
of any results and correct results returned.


Table 1. Geotagging results for infrastructure tweets in Manila, April 2017.

Percent Geotagged   Any Result          Correct Result      Distance Error (km)
Tweets*             Count    Percent    Count    Percent    Min    Mean    Max
1%                  4374     95.40%     1844     40.23%
2%                  4330     95.45%     1299     28.63%
5%                  4201     95.52%     830      18.88%
*Total tweets in dataset (n) = 4629

Table 2. Geotagging results for infrastructure tweets in Manila, October 2017.

Percent Geotagged   Any Result          Correct Result      Distance Error (km)
Tweets*             Count    Percent    Count    Percent    Min    Mean    Max
1%
2%
5%
10%
*Total tweets in dataset (n) = 707


Table 3. Geotagging results for infrastructure tweets in Metro Manila, October 2017. 


Percent Geotagged Any Result Correct Result Distance Error (km) 


Tweets* Count Percent Count Percent Min Mean Max 


5% 13.49% 


10% 12.28% 


*Total tweets in dataset (n) = 2261 


Despite the small number of tweets in the Manila October 2017 dataset 
(Table 2), the model returned the correct result (i.e., successfully matched 
the tweet to the correct grid cell) an average of 22% of the time when the 
typical 1-2% of geotagged tweets served as the “gold standard” dataset. 


Due to the larger geographic area for the tweets tested (listed in Table 3), 
the maximum distance error is much higher than those listed in Table 1 or 
2. However, the mean distance error is only about 2-3 kilometers more than 
that for tweets within the City of Manila proper. In the same vein, the 
percentage of correctly geolocated tweets listed in Table 3 is only about 4% 
less than that listed in Table 2. Thus, while accuracy decreases slightly 
when the spatial boundaries of analysis are expanded, it does not decrease 
to the point where the model is invalid. 


Based on the results of validity testing of the FICUS Twitter tool 
geotagging model, the research team can confidently state that 


• the geotagging model can assign approximately 65%-70% of non-geotagged 
  tweets to a 1 km² area grid cell within approximately 4.92 km of the 
  actual location 

• of the tweets assigned to a grid cell, approximately 20.15% of those 
  tweets will reflect the actual tweet location 

• as n (the number of tweets in the dataset) increases, the percentages of 
  both (1) tweets assigned to any area grid cell and (2) tweets assigned to 
  the correct area grid cell will increase. 


3.3 Sentiment analysis 


The Sentiment Analysis component calculates a numeric sentiment score 
for each tweet. A positive number correlates with a positive sentiment, and 
a negative number with a negative sentiment. Zero is neutral. 


Using a positive and negative dictionary of words compiled by Liu et al. 
(2005) and translated into both English and Filipino (using Google Trans- 
late), this component iterates through each geotagged tweet searching for 
matches to both dictionaries in the tweet text. Each match to a positive 
word is given 1 point, while each match to a negative word is given -1 
point. The final sentiment score is a sum of all points for the total tweet 
text. Figures 6 and 7, respectively, show examples of positively and nega- 
tively scored tweets from the Manila October 2017 dataset. 
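
The scoring rule above (+1 per positive match, -1 per negative match, summed over the tweet text) can be illustrated with a short sketch. The word lists below are tiny illustrative stand-ins, not the translated Liu et al. (2005) dictionaries, and the tokenizer is an assumption.

```python
import re

# Illustrative stand-ins only; the actual dictionaries are far larger.
POSITIVE_WORDS = {"helped", "good", "maganda"}
NEGATIVE_WORDS = {"injuries", "bad", "pangit"}

def sentiment_score(text, positive=POSITIVE_WORDS, negative=NEGATIVE_WORDS):
    """Sum +1 for each positive-word match and -1 for each negative-word match."""
    # Lowercase the text and split it into runs of letters (assumed tokenization).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return sum((tok in positive) - (tok in negative) for tok in tokens)

print(sentiment_score("The new traffic scheme HELPED A LOOOOOT"))
print(sentiment_score("30 patients suffered minor injuries"))
```

A score of zero means either that no dictionary words were found or that positive and negative matches cancelled out, so neutral scores do not distinguish the two cases.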


3.4 Visualization 


The Visualization component takes the results of the previous three com- 
ponents and displays them in an interactive browser (Figure 2, p 6). The 
tweets can be displayed within the open-source OMS environment or visu- 
alized on a desktop using both Leaflet and R/R Studio. 


Grid cells are color-coded according to the mean sentiment score of all ge- 
otagged tweets within the cell. These grid cells can be downloaded as a 
shapefile for further manipulation on a desktop. 
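
As a minimal sketch of the aggregation behind the color coding, the mean sentiment per grid cell can be computed as shown here. The cell identifiers and the three-class color scheme are hypothetical; the report does not specify the class breaks used in the browser.

```python
from collections import defaultdict

def mean_sentiment_by_cell(scored_tweets):
    """Average the sentiment scores of all geotagged tweets in each grid cell.

    scored_tweets: iterable of (cell_id, sentiment_score) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # cell_id -> [score sum, tweet count]
    for cell_id, score in scored_tweets:
        totals[cell_id][0] += score
        totals[cell_id][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}

def color_class(mean_score):
    """Hypothetical three-class scheme for display purposes."""
    if mean_score > 0:
        return "positive"
    if mean_score < 0:
        return "negative"
    return "neutral"

cells = mean_sentiment_by_cell([("A", 2), ("A", -1), ("B", -3)])
print({cell: color_class(score) for cell, score in cells.items()})
```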
Figure 6. Positive tweet example, Manila, Oct 2017. 

Tweet text: “The new traffic scheme for jeepneys sa Sta. Lucia area HELPED A 
LOOOOOT with the traffic in Marcos Hway. LIKE A LOT” (8:18 AM - 24 Oct 2017) 


Figure 7. Negative tweet example, Manila, Oct 2017. 

Tweet text: “Bus transport (TYW 504) falls from Alabang skyway ramp due to 
loss of break. [illegible] says 30 patients suffered minor injuries” (Oct 
2017, from Manila City, National Capital Region) 
4 Developing the Social Media Use Framework 


The HC framework was a theory- or knowledge-driven framework; essentially, 
the framework components (conditions, factors, and indicators) were 
identified and developed based on established literature. Researchers then 
ments. For example, the indicator “Lack of Communications Availability” 
called for metrics reflecting the percentage of communication towers, DSL 
and fiber optic lines disabled by crisis; percentage of population with access 
to communication (radio, television, cell phones, internet, newspapers) 
(compared with historical data); and satisfaction with access to communica- 
tion. Some of these data were accounted for within the national census and 
survey data (as detailed in section 2.1), but other data gaps remained. These 
data gaps are reflected in the higher uncertainty values for the indicator, 
which account for unknown components (section 3.4). 


The social media use framework (Figure 8) was created in order to augment 
the HC framework with a more specific risk assessment and qualitative input 
data, and to demonstrate methods for reducing uncertainties and unknowns 
in computational frameworks. Researchers used a combined approach of 
knowledge-driven and data-driven methodologies to build the social media 
use framework. First, a literature review was conducted to identify indicators 
of social media use. Survey and census data sources were examined to see if 
they could inform new indicators (a data-driven approach). Individual data 
metrics were then grouped into indicators based on their similarity in ad- 
dressing likelihood of social media use. In some cases, these new indicators 
were incorporated into HC framework factors. As researchers defined these 
indicators themselves, the uncertainty was automatically reduced. 


Although it was informed by the HC framework in its inception, the social 
media use framework has the potential to inform the HC framework in 
turn. For example, the social media framework could be added as an addi- 
tional metric within the “Delays to Re-establishment of Transportation 
Systems” indicator in the HC framework to better understand how opin- 
ions expressed on Twitter can complement infrastructure data to paint a 
broader picture of transportation efficiency following a crisis. Depending 
on the level of influence the user of the HC framework wants social media 
use/Twitter to have, the social media use framework could even be in- 
serted at the factor or condition level. 
Figure 8. The social media use framework. 

Framework: Social Media Use 
Conditions: Population; Technology; Place 
Factors: Individual Characteristics; Education; Device Access; Network 
Access; Household Characteristics; Urban Characteristics 
Indicators: Sex & Gender; Age; Disability; Language Abilities; Literacy; 
Educational Attainment; Phone Access; Computer Access; Internet Access; 
Electricity; Wealth Index; Road Network; Urban Proximity 

4.1 Input data 


The data currently informing the social media use assessment framework 
consist of national census microdata on the Philippines and the USAID 
Demographic and Health Survey (USAID DHS), as also mentioned in sec- 
tion 2.1. Figure 9 displays the conditions (teal), factors (purple), and indi- 
cators of the social media use framework, as well as the data informing 
each indicator. Indicators informed by only the USAID DHS results are in 
yellow. Indicators informed by both the USAID DHS and the census mi- 
crodata are in bright green. Any indicators informed by external sources 
(blue) resulted in randomly generated maps that can be updated through 
the incorporation of non-survey data into FICUS. 


Unlike the HC framework, nearly all of the data inputs for the social media 
use framework reflected quantitative, not qualitative, measures and met- 
rics. For example, metrics for the indicator “Electricity” would measure if a 
household has access to electricity or not, rather than if they would con- 
sider access to electricity to be a big issue for their household. 


4.2 Risk value 


Each metric variable must be normalized for apples-to-apples comparisons. 
The user assigns a range of values between 0 and 1 to all possible survey 
question responses, representing the possible extent of risk contribution. 
Threshold values are shown in Figure 10. A value of “0” equates to total 
chaos, while a value of “1” equates to no perceived risk. In the case of the 
social media use framework, a value of “0” equates to not at all likely to 
use social media, while a value of “1” equates to near certain use of social 
media. 
Figure 9. Alignment of survey data to the social media use framework. 
Figure 10. Normalizing survey responses. Each question response is normalized based on a 
range of values between 0 and 1 representing the possible extent of risk contribution. 


Metric Value    Description 
1               Minimal risk. Has no impact on risk. 
0.75            Minimal risk. If the weights of all metrics with a 0.75 value sum to 2.0 or greater, 
                indicator will be slight risk or worse even if all other metrics have a 1.0 value. 
0.500...1       Minimal risk. If the sum of all weights with a metric value of 0.500...1 is 1.0, any 
                other metric has a value < 1.0, indicator will be slight risk or worse. 
0.5             Slight risk. If the weights of all metrics with a 0.5 value sum to 2.0 or greater, 
                indicator will be slight risk or worse even if all other metrics have a 1.0 value. 
0.2500...1      Slight risk. If the sum of all weights with a metric value of 0.500...1 is 1.0, any other 
                metric has a value < 1.0, indicator will be medium risk or worse. 
0.25            Medium risk. 
0.12500...1     Medium risk. 
0.125           High risk. 
0.0625          Extreme risk. 


If the level of risk (or social media use) is uncertain, a range (min and max) of 
values may be selected. The wider the range, the less certain of the risk contri- 
bution. The tighter the range, the more certain of the risk contribution. 
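
A small sketch of how a (min, max) range might be turned into one realized value per survey response follows. The ranges are those documented for the DHSSex metric in Appendix A; the uniform draw and the function names are assumptions made for illustration, since the sampling distribution is not stated here.

```python
import random

# Response ranges taken from the DHSSex metric documented in Appendix A.
DHS_SEX_RANGES = {"Male": (0.5, 1.0), "Female": (0.5, 0.9)}

def range_width(response, ranges):
    """Width of the assigned (min, max) range; a wider range means less certainty."""
    low, high = ranges[response]
    return high - low

def realize_value(response, ranges, rng):
    """Draw one normalized value for a response (uniform draw is an assumption)."""
    low, high = ranges[response]
    return rng.uniform(low, high)

rng = random.Random(42)
for answer in ("Male", "Female"):
    width = round(range_width(answer, DHS_SEX_RANGES), 2)
    value = round(realize_value(answer, DHS_SEX_RANGES, rng), 3)
    print(answer, width, value)
```

Repeating the draw many times yields a set of realizations whose spread reflects the stated uncertainty for that response.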
4.3 Weights 


Screening requires the evaluation of a combination of indicators. Multiple 
indicators are often aggregated into a factor or condition, usually for 
comparison across locations or to indicate change over time. Weights are 
assigned to metrics, indicators, factors, and conditions, allowing each 
component level to be rolled up to the next. Components are weighted 
against the other components within that level (i.e., all the metrics in one 
indicator, all the indicators in one factor, etc.). Users define weights as a 
numerical value between 0 and 1 based on contribution to risk, 0 being the 
least important/constraining on the next level up, and 1 being the most 
important/constraining. Again, weights can be defined in terms of a range 
given an uncertain risk contribution level. 


If all weights within a grouping add up to 1, then each unit contributes to 
the accumulation of risk. If any one (or more) of the weights within a 
grouping equals 1, then it (or they) drive the overall risk. 


4.4 Unknown components 


We know that this is not a complete list of metrics, indicators, factors, or 
conditions. There are potentially other variables that were not considered. 
In this situation, users can opt to add an unknown component. This inserts 
a random value that accounts for additional variables. It is given a 
weight and treated the same as any other component. 


4.5 Roll-up computation 


4.5.1 Favorability functions 


Favorability functions calculate the overall component values using the risk 
value and the weight. Favorability functions were originally known as sieve 
mapping (McHarg 1969) and are also called map overlays. Before map overlay 
existed as a computer algorithm, clear acetate maps were inked at locations 
least favorable to an activity or land use. Stacking the acetate maps on top 
of each other provided a visual method to assess each location’s suitability 
or risk. Bonham-Carter (1995) described digital map overlay as favorability 
functions, which include easier weighting of individual maps and exact 
measures of suitability or risk. Traditionally, there have been two general 
types of favorability functions: the additive-based favorability function 
(equation 1), known as weighted linear combination (WLC) (Malczewski 2000), 
and the constraint-based favorability function (CBFF) (equation 2). 
My = VWiza ey, (2) 


where 


$M$ is a map with values between 0.0 and 1.0, 
$M_r$ is the resulting indicator or risk assessment map, 
$M_i$ is the $i$-th of $n$ criteria maps, and 
$w_i$ is the $i$-th weight to its criteria map. 


WLC-based risk assessments have the problem that a large risk can be 
masked by other no/low risks at the same location. For example, a location 
where gang violence creates a high risk should have high risk even though 
other criteria, such as well-funded after-school programs, indicate a low 
risk. On the other hand, CBFF risk assessment does not allow for groups of 
criteria to positively reinforce each other: every criterion constrains the 
indicator to the value of that criterion and no higher. 


An ideal favorability function will allow both additive and constraint-based 
characteristics to be declared both within criteria map locations and in the 
criteria map weighting. Equation 3, the Power-Based Favorability Function 
(PBFF), achieves this goal: 

$$M_r = \prod_{i=1}^{n} M_i^{w_i} \tag{3}$$ 


where 


$M$ is a map with values between 0.0 and 1.0, 
$M_r$ is the $r$-th realization map of indicator values, 
$M_i$ is the $i$-th of $n$ criteria maps, and 
$w_i$ is the $i$-th weight to its criteria map. 


While PBFF’s equation (3) is less intuitive than WLC and CBFF, having 
criteria variables be the base of a power function allows it to be 
equivalent to WLC when the criteria weights sum to 1.0. Also, PBFF allows 
criteria weights to influence the indicator like both WLC and CBFF when 
criteria weights sum to be greater than 1.0. For example, if a criterion has 
a weight of 1.0, its indicator will have the same or lower risk value, just as 
in CBFF. Criteria with weight less than 1.0 in PBFF will allow the indicator 
to have a higher value when other criteria are positive. Thanks to the 
product function in PBFF, criteria map locations can also be constraints when 
those locations are given a value of 0.0 or other extremely low values. 
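
The behaviors described above can be checked numerically for a single map location. The sketch below assumes the standard WLC form (a weighted sum) and the product-of-powers form of PBFF; the criteria values and weights are hypothetical, and CBFF is not implemented here.

```python
import math

def wlc(values, weights):
    """Weighted linear combination: sum of w_i * M_i at one location."""
    return sum(w * m for m, w in zip(values, weights))

def pbff(values, weights):
    """Power-based favorability: product of M_i ** w_i at one location."""
    return math.prod(m ** w for m, w in zip(values, weights))

# One extreme-risk criterion (0.0625) next to two no-risk criteria (1.0).
values = [0.0625, 1.0, 1.0]

# WLC with equal weights lets the two no-risk criteria mask the extreme risk.
print(wlc(values, [1 / 3, 1 / 3, 1 / 3]))

# PBFF with a weight of 1.0 on the first criterion caps the result at its value.
print(pbff(values, [1.0, 0.5, 0.5]))

# A criterion value of 0.0 acts as a hard constraint under PBFF.
print(pbff([0.0, 1.0], [0.5, 0.5]))
```

Because every criterion value lies between 0.0 and 1.0, raising it to a weight below 1.0 moves it toward 1.0, which is how lower weights soften a criterion's constraint on the indicator.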


Another benefit of using PBFF, similar to CBFF’s, is in the calibration 
process. Risk model developers can adjust individual criteria map weights or 
the values within the criteria map locations to calibrate to the desired 
indicator values without having to adjust other criteria map weights. 
Traditional risk assessment techniques require carefully choosing the 
indicator’s criteria weights and adjusting all of them whenever calibration 
is done. PBFF provides an opportunity for non-linear optimization algorithms, 
such as neural nets or genetic algorithms, to create criteria map weights as 
well as the function variables that minimize the errors to known indicator 
values. 


4.5.2 Quantifying errors and uncertainty 


A stated goal of the FICUS effort was to explicitly represent errors and un- 
certainties within all products. For the PBFF to specifically quantify uncer- 
tainty, equation 4 becomes the Uncertainty Quantified (UQ) PBFF. 

$$M_r = M_u^{w_u} \times \prod_{i=1}^{n} M_i^{w_i}, \qquad 1.0 < w_u + \sum_{i=1}^{n} w_i \tag{4}$$ 


where 


$M_u$ is the $r$-th realization map of simulated uncertainty for an 
indicator, and 
$w_u$ is the weight of that uncertainty. 


The simulated uncertainty map should be a random field of values be- 
tween 0.0 and 1.0 with a histogram like the distribution of values within 
the criteria maps. The random field should have spatial autocorrelation to 
the largest spatial dependence of the criteria maps. For example, if the cri- 
teria maps used kernel analysis on demographic factors, the random field 
should have positive spatial autocorrelation equal to the kernel analysis di- 
ameter. While this research used the random field described in Ehlschlae- 
ger (2002), there are many theoretical random field models to choose 
from, for example GSLIB (Deutsch & Journel 1992). And while equation 4 
explicitly represents the known uncertainties in the modeling process, the 
modelers were expected to represent the unknown uncertainties as well. 
With regard to UQ PBFF, modelers were expected to estimate the range of 
values for all weights, $w_u$ and $w_i$, that might exist, accounting for the 
lack of perfect understanding between the criteria and the indicator. We 
asked the modelers to imagine which criteria they wish existed that would 
better explain the indicator. Then, modelers were to estimate which of those 
unavailable criteria had the least correlation with available criteria. 
Uncorrelated unavailable criteria would be indicated by higher values and 
greater ranges of the uncertainty weight $w_u$. This uncertainty weight has 
the same behavior on the risk assessment model as the criteria weights. 
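
A single-cell sketch of the UQ PBFF follows, producing a mean and standard deviation across realizations. It simplifies the method in one important way: the uncertainty value is drawn independently and uniformly, whereas the report calls for a spatially autocorrelated random field. The criteria values are hypothetical; the weight ranges follow the 0.333 condition weights and the 0.1 to 0.3 uncertainty range given in Appendix A.

```python
import math
import random
import statistics

def uq_pbff_cell(values, weight_ranges, uncertainty_weight_range, rng):
    """One realization at one cell: M_u ** w_u times the product of M_i ** w_i."""
    m_u = rng.random()  # assumption: uniform draw instead of a correlated random field
    w_u = rng.uniform(*uncertainty_weight_range)
    result = m_u ** w_u
    for m, (w_min, w_max) in zip(values, weight_ranges):
        result *= m ** rng.uniform(w_min, w_max)  # weight drawn from its stated range
    return result

def summarize(values, weight_ranges, uncertainty_weight_range, realizations=200, seed=0):
    """Mean and population standard deviation over repeated realizations."""
    rng = random.Random(seed)
    draws = [
        uq_pbff_cell(values, weight_ranges, uncertainty_weight_range, rng)
        for _ in range(realizations)
    ]
    return statistics.mean(draws), statistics.pstdev(draws)

condition_values = [0.8, 0.6, 0.9]        # hypothetical condition values at one cell
condition_weights = [(0.333, 0.333)] * 3  # min and max weight per condition
mean, spread = summarize(condition_values, condition_weights, (0.1, 0.3))
print(round(mean, 3), round(spread, 3))
```

Widening the uncertainty weight range increases the spread across realizations, which corresponds to a larger standard deviation in the per-cell output.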
Augmenting Twitter Sentiment Maps with Social Media Framework Maps 


This section describes how the results of the FICUS Twitter Tool (Section 
2) and the Social Media Use Framework (Section 3) can be used in tan- 
dem. Uncertainty is inherent in both the geotagged, sentiment-coded 
tweet results and in the social media use framework results. However, 
through the combination of these results using raster calculation and map 
algebra, overall uncertainty can be reduced, and more informed data- 
driven decisions can be made within the operational environment. 


The results of the social media use framework quantify, on a 200 m x 200 
m neighborhood level, the likelihood of people living in that grid cell using 
Twitter and other forms of social media. By overlapping the tweet senti- 
ment grid with the framework results, areas with strong sentiments (either 
positive or negative) can potentially be amplified or tempered based on 
whether or not there is a high likelihood of social media users within that 
grid cell. In this way, a methodology is provided that can help prevent cer- 
tain tweet/Twitter users displaying strong sentiment from skewing the 
grid map (Figure 11). 


The augmentation process was completed in three steps. First, the senti- 
ment grid map was projected and rasterized with 200 m x 200 m grid cell 
size (the same as the framework map results). Second, a new raster was 
created by reclassifying the framework raster based on the value of the 
framework map (Table 4). Using power (square) numbers (as is done with 
the framework components and metric metadata weighting scale), cells 
with values closer to 1 are given a higher number in order to better amplify 
sentiment scores in areas with a greater likelihood of frequent social media 
usage. Third, a raster calculator is used to multiply the reclassified 
framework raster and the tweet sentiment raster to produce an augmented 
sentiment raster map (Figure 11). 
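
The second and third steps can be sketched with small in-memory rasters. The breakpoints and power-of-two multipliers below are hypothetical placeholders, since the Table 4 reclassification values are not reproduced in this section, and both rasters are assumed to be already aligned on the same 200 m x 200 m grid.

```python
def reclassify(likelihood):
    """Map a framework value in [0, 1] to a multiplier (hypothetical breakpoints)."""
    if likelihood >= 0.75:
        return 8
    if likelihood >= 0.5:
        return 4
    if likelihood >= 0.25:
        return 2
    return 1

def augment(sentiment_raster, framework_raster):
    """Cell-by-cell product of the sentiment raster and the reclassified framework raster."""
    return [
        [score * reclassify(value) for score, value in zip(score_row, value_row)]
        for score_row, value_row in zip(sentiment_raster, framework_raster)
    ]

sentiment = [[2, -3], [1, 0]]          # mean sentiment per cell
framework = [[0.9, 0.1], [0.6, 0.3]]   # likelihood of social media use per cell
print(augment(sentiment, framework))
```

With these placeholder values, a strong sentiment in a cell with low likelihood of social media use keeps its original magnitude, while sentiment in high-likelihood cells is amplified.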


As seen in Figure 11, the augmented map filters out several of the stronger 
sentiment grid cells, with two adjacent cells showing the strongest negative 
and positive sentiments. Logically, the area correlating with these grid cells 
would be the first area of interest (AOI) for warfighters in the operational 
environment. The augmented maps allow for more informed course of action 
planning, and for making better data-driven decisions with less uncertainty. 
Figure 11. Twitter sentiment map augmented by social media framework. The top left map shows 
sentiment-scored tweets; the bottom left shows the results of the social media use likelihood 
framework; and the right shows the results of processing both maps through map algebra. 
Conclusion 


The social media use framework, augmented with the FICUS Twitter tool, 
was created with the goal of demonstrating the ease of using a more gen- 
eral, knowledge-driven risk assessment framework (in this case, one which 
focuses on humanitarian crises) to inform a more specific, data-driven 
framework. Both the HC and social media use frameworks augment each 
other; and when used together, they can reduce overall uncertainties and 
increase overall awareness within the operational environment. Uncer- 
tainties are inherent, but rarely acknowledged, in both social media mod- 
eling and computational frameworks. This paper seeks to address both. 


Implementing quantitative measures for the frameworks makes it easier to 
compare changes as new data are made available. Further, it offers the 
ability to trace backward from the factors down to the 
metrics, so one can explore the impact of changing the weights of different 
components at the higher level, allowing accurate calibration of the ana- 
lytic framework. Finally, the explicit accounting for uncertainty at each 
level allows analysts to more faithfully represent their understanding of 
the framework values, particularly when SMEs are not available. This re- 
duces the uncertainty of those judgments. 


There are likely to be gaps in the framework that are either not obvious or 
obscured by other framework components. Finding these gaps is vital to 
ensure the highest accuracy possible. This is a point where the hybrid ap- 
proach, the incorporation of existing data and theoretical methods, can 
identify and address critical gaps in the framework. Both data availability 
and theoretical methods inform framework development in distinct ways. 
The geospatial risk maps provide intuitive methods for calibration and val- 
idation via qualitative techniques. When framework map errors are identi- 
fied, there is an explicit connection to all modeling decisions and data 
streams to determine whether there is a logical flaw in the framework 
model or calibration is necessary to improve the analytic framework. 


There are inherent issues with social media (and other big data) modeling 
that often go undiscussed (Ruths and Pfeffer 2014, Shelton 2017). Accord- 
ing to Ruths and Pfeffer (2014), these include: the failure of APIs to accu- 
rately represent an entire platform’s data (and the methodology behind 
choosing what results are returned being hidden); platform-specific 
behavioral norms; platform user bias; no canonical datasets available to 
compare methodologies (in a statistically sound manner); proxy population 
mismatch, etc. Attempting to geolocate tweets without any spatial data also falls 
under these failures. It is worth noting again that there is no research-based 
best practice for geotagging and analyzing Twitter data. Larger datasets, 
larger geographies, and other factors could prove less successful with the 
methodology outlined here, and more successful with other methodologies 
from other research studies. However, by combining a geotagging via lan- 
guage modeling analysis with a computational framework, some of these 
failures and uncertainties are addressed and explicitly quantified. 


In addition, there are other methods for reducing uncertainty overall. For 
example, instead of using grid cells of 1 km x 1 km, users of the FICUS 
Twitter tool could create neighborhood clusters similar to Flatow et al. 
(2015). One could also filter out time series/dates to allow for more event- 
related geolocation and Twitter analysis. Employing a SME who is familiar 
with the language and dialects of the AOI, as well as the geography, could 
greatly increase accuracy, both in creating the “bag of words” and in 
providing sentiment analysis. With their help, a training component could 
be added to the model. Accuracy can even be increased within the 
visualization component, as seen in Figures 12 and 13. The recommendation 
of Shelton et al. (2015) to limit tweets in a dataset to five per user 
slightly alters the mean sentiment scores of the grid cells, reducing the 
ability of overactive Twitter users to skew the dataset one way or the other. 
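
The five-per-user limit can be illustrated with a short filter. This is a sketch under assumptions: tweets are represented as dictionaries with a hypothetical `user` key, and the first five tweets per user in input order are kept, since the selection rule is not specified here.

```python
from collections import Counter

def limit_tweets_per_user(tweets, max_per_user=5):
    """Keep at most max_per_user tweets per user, preserving input order."""
    seen = Counter()
    kept = []
    for tweet in tweets:
        user = tweet["user"]
        if seen[user] < max_per_user:
            kept.append(tweet)
            seen[user] += 1
    return kept

tweets = [{"user": "a", "score": -2}] * 8 + [{"user": "b", "score": 1}] * 2
filtered = limit_tweets_per_user(tweets)
print(len(tweets), len(filtered))
```

Applying the filter before computing per-cell means keeps one very active account from dominating the sentiment of a grid cell.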


The HC and Social Media Use framework exercise, along with the FICUS 
Twitter tool, serves as a good starting point for linking indicators to higher 
level planning objectives. With the addition of spatially representing quan- 
titative metrics, incorporating uncertainty, and weighting the importance 
of individual components, analysts have the ability to more accurately and 
precisely communicate knowledge of the operating environment. This, in 
turn, provides a genuine pathway for the data-to-decisions paradigm. 
Figure 12. Manila infrastructure tweets-all, October 2017. 
Appendix A: Social Media Use (Twitter) Framework Technical Documentation 


A.1 Social media user framework overview 


The Social Media User framework is a computational framework that 
seeks to identify likely regular users of social media in the Philippines 
(with particular emphasis on the metropolitan Manila area). While this 
framework is structured and evaluated in a way that pinpoints social media 
use in general (i.e., it is not platform specific), the goal of this 
framework is to augment an open-source Twitter model to reduce the 
uncertainty of social media posting content, location, and sentiment. 


This framework, in its current state, consists mostly of metrics from 
publicly available international surveys. The condition, factor, and 
indicator descriptions are inclusive enough that other data sources can 
augment the framework components in a way that can further reduce the 
uncertainty across all levels. 


Conditions within this framework include Population, Technology, and Place. 


A.1.1 Condition weighting logic 


All three conditions within the Social Media Framework are given both a 
minimum and a maximum value of 0.333. This is because none of the con- 
ditions will, by themselves, determine the framework’s risk value. 


Condition (Heading Number)    Weight Minimum Value    Weight Maximum Value 
Population (A2)               0.333                   0.333 
Technology (A3)               0.333                   0.333 
Place (A4)                    0.333                   0.333 
A.1.2 Example outcome 


Social Media Framework 

Panels: Social Media Use Likelihood - Mean; Standard Deviation (legend from 
Low: 0 to High: 1; scale bars in kilometers) 


A.2 Population condition overview 


This condition encompasses the population demographic aspects of likely 
social media users. In essence, this condition accounts for the 
characteristics of persons (Age, Gender, Educational Level) that increase or 
decrease the likelihood of them using social media. 


All factors currently have available metric and indicator data. Factors 
within this condition include Individual Characteristics and Education. 


A.2.1 Factor weighting logic 


Both the Individual Characteristics and Education factors are weighted a 
minimum and maximum of 0.5, as they are equally important in predicting 
the likelihood of social media use. 


Factor                        Weight Minimum Value    Weight Maximum Value 
Individual Characteristics    0.5                     0.5 
Education                     0.5                     0.5 
A.2.2 Uncertainty values 


This condition is given a range of uncertainty values from 0.1 to 0.3 
because of the perceived ability of the existing factors to fully address 
the themes of the condition. As more research on social media users, 
particularly social media/Twitter users in the Philippines, becomes 
available, additional demographic factors could further reduce the 
uncertainty. 


Condition Uncertainty Min: 0.1 


Condition Uncertainty Max: 0.3 


A.2.3 Example outcome 


Population Condition 

Panels: Social Media Use Likelihood - Mean; Standard Deviation (legend from 
Low: 0 to High: 1) 


A.3 Individual characteristics factor overview 


The Individual Characteristics Factor focuses on immutable personal traits 
of individuals represented in the sample population. All indicators cur- 
rently have available metric data. 


Indicators within this factor include Sex & Gender, Age, and Disability. 
A.3.1 Indicator weighting logic 


Because a disability, particularly a vision impairment, would most likely 
prevent someone from being a regular social media user, the Disability 
indicator is given the highest maximum weight value. Age is more of a 
determinant than Sex/Gender, so the Age indicator is given a higher 
minimum and maximum weight. 


Indicator Weight Minimum Value Weight Maximum Value 


Sex & Gender 
Age 
Disability 


A.1.1 Uncertainty values 


This factor is given a range of uncertainty values from 0.1 to 0.3 because of 
the perceived ability of the existing indicators to fully address the themes 
of the factor. As more research on social media users, particularly social 
media/Twitter users in the Philippines, becomes available, additional 
demographic indicators could further reduce the uncertainty. 


Factor Uncertainty Min: 0.1 


Factor Uncertainty Max: 0.3 
A.1.2 Example outcome 


Individual Characteristics Factor 

Panels: Social Media Use Likelihood - Mean; Standard Deviation 
A.1.3 Indicator: Sex & gender 


The Sex and Gender indicator accounts for the gender characteristics of 
individuals represented in the sample population. 


A.1.4 Metric weighting logic 


As both metrics within this indicator represent the same kind of data, they are 
weighted the same and given both a minimum and maximum value of 0.5. 


Metric       Weight Minimum Value    Weight Maximum Value 
DHSSex       0.5                     0.5 
IPUMSSex     0.5                     0.5 


A.1.5 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.2 
A.1.6 Example outcome 


Sex Gender Indicator 
A.1.7 Metric: DHSSex 


Indicator: Sex & Gender 
Factor: Individual Characteristics 
Condition: Population 
Framework: Social Media User 
Metric Assigned Values: Male (min: 0.5, max: 1.0); Female (min: 0.5, max: 0.9) 
Survey: DHS 
Survey Date: 2013 
Other Data Sources: Third Team Media, 2017; Greenwood et al, 2016 

Logic: This metric consists of survey respondent gender identification from 
the DHS survey. Answers of “Male” are given a minimum value of 0.5 (slight 
risk, 2^-1) and a maximum value of 1.0 (minimal risk with no impact on 
risk, 2^0). Answers of “Female” are given a minimum value of 0.5 and a 
maximum value of 0.9 (minimal risk). 

According to Third Team Media, men are slightly more likely to be social 
media users than women (2017). In an American context, men and 
women used Twitter at equal percents (Greenwood et al, 2016). 

Maps will range from slight risk to no risk. 


Example Realization Metric Map Not Available 
A.1.8 References 


Third Team Media. The State of Social Media and Digital in the Philippines for 2017. 
Jan 24, 2017. https://www.slideshare.net/likke13/the-state-of-social-media-and-digital-in-the-philippines-for-2017 
(accessed on 8 Nov 2017). 


Greenwood, Shannon, Andrew Perrin and Maeve Duggan. Social Media Update 2016. 
Pew Research Center. Nov 11, 2016. http://www.pewinternet.org/2016/11/11/social-media-update-2016/. 


A.1.9 Metric: IPUMSSex 


Indicator: Sex & Gender 
Factor: Individual Characteristics 
Condition: Population 
Framework: Social Media User 
Metric Assigned Values: Male (min: 0.5, max: 1.0); Female (min: 0.5, max: 0.9) 
Survey: IPUMS 
Survey Date: 2000 
Other Data Sources: Third Team Media, 2017; Greenwood et al, 2016 

Logic: This metric consists of survey respondent gender identification from 
the IPUMS survey. Answers of “Male” are given a minimum value of 
0.5 (slight risk, 2^-1) and a maximum value of 1.0 (minimal risk with no 
impact on risk, 2^0). Answers of “Female” are given a minimum value 
of 0.5 and a maximum value of 0.9 (minimal risk). 

According to Third Team Media, men are slightly more likely to be 
social media users than women (2017). In an American context, men 
and women used Twitter at equal percents (Greenwood et al, 2016). 

Maps will range from slight risk to no risk. 


Example Realization Metric Map 
IPUMS Sex Metric


A.1.10 References


Third Team Media. The State of Social Media and Digital in the Philippines for 2017.
Jan 24, 2017. https://www.slideshare.net/likke13/the-state-of-social-media-and-digital-in-the-philippines-for-2017. Accessed 8 Nov 2017.


Greenwood, Shannon, Andrew Perrin, and Maeve Duggan. Social Media Update 2016.
Pew Research Center. Nov 11, 2016. http://www.pewinternet.org/2016/11/11/social-media-update-2016/.


A.1.11 Indicator: Age 


The Age indicator accounts for the age characteristics of individuals
represented in the sample population.


A.1.12 Metric weighting logic 


As there is only one metric for this indicator, it is given both a minimum 
and maximum value of 1.0. 


Metric      Weight Minimum Value   Weight Maximum Value
IPUMSAge    1.0                    1.0


A.1.13 Uncertainty values


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. The addition of 
metrics measuring age groups of Twitter users specifically could further 
reduce the uncertainty values. 


A.1.14 Example outcome 


Age Indicator 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.1.15 Metric: IPUMSAge 


Indicator: Age 
Factor: Individual Characteristics 
Condition: Population 


Framework: Social Media User 

Metric Assigned Values:
0to12 (min: 0.001, max: 0.0625)
13to17 (min: 0.25, max: 0.6)
18to24 (min: 0.5, max: 0.9)
25to34 (min: 0.5, max: 0.8)
35to44 (min: 0.25, max: 0.6)
45to54 (min: 0.125, max: 0.25)
55to64 (min: 0.01, max: 0.125)
65+ (min: 0.001, max: 0.0625)

Survey: IPUMS
Survey Date: 2000
Other Data Sources: Third Team Media, 2017

Logic: This metric consists of respondent and household member age
from the IPUMS survey. Answers of 0 to 12 years and 65+ years
are given a minimum value of 0.001 (extreme risk) and a
maximum value of 0.0625 (extreme risk, 1/2^4). Answers of 13 to
17 and 35 to 44 are given a minimum value of 0.25 (medium
risk, 1/2^2) and a maximum value of 0.6 (minimal risk). Answers of
18 to 24 are given a minimum value of 0.5 (slight risk, 1/2^1) and a
maximum value of 0.9 (minimal risk). Answers of 25 to 34 are
given a minimum value of 0.5 and a maximum value of 0.8
(minimal risk). Answers of 45 to 54 are given a minimum value of
0.125 (high risk, 1/2^3) and a maximum value of 0.25. Answers of
55 to 64 are given a minimum value of 0.01 (extreme risk) and a
maximum value of 0.125.


Third Team Media ranked Facebook users by age in the 
Philippines in 2017. These age rankings were used to evaluate 
the IPUMS survey answers. 


Maps will range from extreme risk to minimal risk. 
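
As an illustration of how the age bins above could be applied to individual survey records, the following Python sketch looks up the assigned (min, max) pair for a given age. The bin boundaries and values are copied from the Metric Assigned Values; the names AGE_BINS and age_value_range are hypothetical and do not come from the FICUS code.

```python
import bisect

# (lower bound of bin in years, label, min value, max value),
# copied from the Metric Assigned Values for IPUMSAge.
AGE_BINS = [
    (0, "0to12", 0.001, 0.0625),
    (13, "13to17", 0.25, 0.6),
    (18, "18to24", 0.5, 0.9),
    (25, "25to34", 0.5, 0.8),
    (35, "35to44", 0.25, 0.6),
    (45, "45to54", 0.125, 0.25),
    (55, "55to64", 0.01, 0.125),
    (65, "65+", 0.001, 0.0625),
]
_LOWER_BOUNDS = [lower for lower, _, _, _ in AGE_BINS]


def age_value_range(age: int):
    """Return (label, min, max) for the bin that contains the given age."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right finds the first bin whose lower bound exceeds the age;
    # the bin just before it is the one that contains the age.
    index = bisect.bisect_right(_LOWER_BOUNDS, age) - 1
    _, label, low, high = AGE_BINS[index]
    return label, low, high
```

For example, age_value_range(20) returns the 18to24 bin with its range of 0.5 to 0.9.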


Example Realization Metric Map 
IPUMS Age Metric


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.1.16 References 


Third Team Media. The State of Social Media and Digital in the Philippines for 2017.
Jan 24, 2017. https://www.slideshare.net/likke13/the-state-of-social-media-and-digital-in-the-philippines-for-2017.


A.1.17 Indicator: Disability


The Disability indicator accounts for whether or not individuals in the 
sample population have a disability that could potentially impact their use 
of social media. 


A.1.18 Metric weighting logic 


As the DisabilityStatus metric is more general, it is weighted lower than 
the VisionImpaired metric. 


Metric              Weight Minimum Value   Weight Maximum Value
VisionImpaired      0.6                    0.8
DisabilityStatus    0.5                    0.6
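
This appendix does not state how weighted metrics are combined into an indicator value. The Python sketch below shows one plausible approach, offered purely as an assumption: each weight is drawn uniformly from its (min, max) range and the metric values are combined as a normalized weighted mean. The weight ranges are those read from the table above; the names METRIC_WEIGHTS and combine_metrics are hypothetical.

```python
import random

# Weight ranges (min, max) as read from the metric weighting table.
METRIC_WEIGHTS = {
    "VisionImpaired": (0.6, 0.8),
    "DisabilityStatus": (0.5, 0.6),
}


def combine_metrics(metric_values, weight_ranges, rng):
    """Combine metric values into one indicator value.

    ASSUMPTION (not from the report): each weight is drawn uniformly
    from its range, and the result is the weighted mean normalized by
    the sum of the drawn weights.
    """
    drawn = {name: rng.uniform(low, high)
             for name, (low, high) in weight_ranges.items()}
    total_weight = sum(drawn.values())
    weighted_sum = sum(drawn[name] * metric_values[name] for name in drawn)
    return weighted_sum / total_weight


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the draw is repeatable
    values = {"VisionImpaired": 0.2, "DisabilityStatus": 0.8}
    print(combine_metrics(values, METRIC_WEIGHTS, rng))
```

Because the weights are normalized, the combined value always lies between the smallest and largest metric value supplied.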
A.1.19 Uncertainty values


This indicator is given an uncertainty minimum value of 0.1 and an
uncertainty maximum value of 0.3 because of the perceived ability of the
existing metrics to fully address the themes of the indicator.


Indicator Uncertainty Min: 0.1


Indicator Uncertainty Max: 0.3


A.1.20 Example outcome 


Disability Indicator 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation (Low: 0, Mid: 0.04), each with a scale bar in kilometers.]


A.1.21 Metric: VisionImpaired 


Indicator: Disability 

Factor: Individual Characteristics 
Condition: Population 

Framework: Social Media User 


Metric Assigned Values:
Yes (min: 0.125, max: 0.25)
No (min: 0.25, max: 0.9)
Unknown (min: 0.125, max: 0.9)

Survey: IPUMS
Survey Date: 2000
Other Data Sources: N/A

Logic: This metric consists of answers to the IPUMS survey question on whether
the respondent has any kind of vision impairment and/or is blind.
Answers of “Yes” are given a minimum
value of 0.125 (high risk, 1/2^3) and a maximum value of 0.25
(medium risk, 1/2^2). Answers of “No” are given a minimum value of
0.25 and a maximum value of 0.9 (minimal risk). Answers of
“Unknown” are given a minimum value of 0.125 and a maximum
value of 0.9. Maps will range from high risk to minimal risk.


Example Realization Metric Map 


Vision Impaired Metric 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.1.22 Metric: DisabilityStatus 


Indicator: Disability 

Factor: Individual Characteristics 

Condition: Population 

Framework: Social Media User 

Metric Assigned Values:
Yes (disabled) (min: 0.125, max: 0.5)
No (not disabled) (min: 0.25, max: 0.9)
Unknown (min: 0.125, max: 0.9)

Survey: IPUMS
Survey Date: 2000
Other Data Sources: N/A

Logic: This metric consists of answers to the IPUMS survey question of whether the
respondent has any kind of disability.
Answers of “Yes” are given a minimum value of 0.125 (high risk,
1/2^3) and a maximum value of 0.25 (medium risk, 1/2^2). Answers of
“No” are given a minimum value of 0.25 and a maximum value of
0.9 (minimal risk). Answers of “Unknown” are given a minimum
value of 0.125 and a maximum value of 0.9. Maps will range from
high risk to minimal risk.


Example Realization Metric Map 


Disability Status Metric 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.2 Education factor overview


This factor concerns the educational backgrounds of the sample popula- 
tion. Indicators within this factor include Language Abilities, Literacy, and 
Education Attainment. All indicators currently have available metric data. 


A.2.1 Indicator weighting logic 


As literacy and higher educational attainment have been shown to be
indicators of greater development in ICT (information and communication
technologies), which include social media platforms, these indicators are
weighted higher than Language Abilities (Albert et al. 2016).
Indicator               Weight Minimum Value   Weight Maximum Value
Language Abilities      0.3                    0.5
Literacy                0.5                    0.7
Education Attainment    0.5                    0.7


A.2.1.1 Uncertainty values


This factor is given a range of uncertainty values from 0.1 to 0.2 because of
the perceived ability of the existing indicators to fully address the themes
of the factor. The addition of more education-focused indicators could
further reduce the uncertainty.


Factor Uncertainty Min: 0.1
Factor Uncertainty Max: 0.2


A.2.1.2 Example outcome


Education Factor 


[Figure: map panel, Social Media Use Likelihood - Mean (Low: 0 to High: 1).]


A.2.1.3 References


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf.


A.2.2 Indicator: Language abilities 


The Language Abilities indicator accounts for the different languages spoken
by members of the sample population and represented in social media
posts/tweets. Because English is often overrepresented in study-area tweets
relative to the actual language skills and dialects of the population
as a whole, English-speaking ability is currently the only metric in this indicator.


A.2.2.1 Metric weighting logic 


As there is only one metric in this indicator, it is given both a minimum 
and maximum value of 1.0. 


Metric           Weight Minimum Value   Weight Maximum Value
SpeaksEnglish    1.0                    1.0


A.2.2.2 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. Additional metrics 
encompassing other language skills, such as local dialects, as well as met- 
rics that analyze what language is most represented in local tweets, could 
further reduce the uncertainty. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.2 
A.2.2.3 Example outcome


Language Abilities Indicator


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.2.2.4 Metric: SpeaksEnglish


Indicator: Language Abilities 

Factor: Education 

Condition: Population 

Framework: Social Media User 

Metric Assigned Values:
Yes (min: 0.75, max: 1.0)
No (min: 0.25, max: 0.5)
Unknown (min: 0.25, max: 1.0)

Survey: IPUMS
Survey Date: 2000
Other Data Sources: Andrei et al., 2016; Kroulek, 2017

Logic: This metric concerns whether or not the survey respondent speaks
English, according to the IPUMS survey. Answers of “Yes” are given a minimum
value of 0.75 (minimal risk) and a maximum value of 1.0 (minimal risk with
no impact on risk, 1/2^0). Answers of “No” are given a minimum value of
0.25 (medium risk, 1/2^2) and a maximum value of 0.5 (slight risk, 1/2^1).
Answers of “Unknown” are given a minimum value of 0.25 and a
maximum value of 1.0.

According to Andrei et al. (2016), 80% of tweets from the Philippines 
are in English and 20% are in Filipino. English is one of two official 
languages in the Philippines (the other is Filipino), and over 92% of the 
population can speak it as a second language (Kroulek 2017). 

Maps will range from medium risk to no risk. 
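
The example maps report a mean and a standard deviation, which suggests that each metric is evaluated over repeated realizations. The report does not describe the sampling procedure in this section, so the Python sketch below is only an assumption: for each realization, every respondent's answer is replaced by a value drawn uniformly from its (min, max) range, and the per-realization cell averages are then summarized. The names SPEAKS_ENGLISH and summarize_cell are hypothetical.

```python
import random
import statistics

# Assigned value ranges (min, max) for the SpeaksEnglish metric.
SPEAKS_ENGLISH = {
    "Yes": (0.75, 1.0),
    "No": (0.25, 0.5),
    "Unknown": (0.25, 1.0),
}


def summarize_cell(answers, value_ranges, n_realizations, seed=0):
    """Return (mean, standard deviation) of a grid cell's metric value.

    ASSUMPTION (not from the report): values are drawn uniformly
    between the assigned minimum and maximum for each answer.
    """
    rng = random.Random(seed)
    cell_means = []
    for _ in range(n_realizations):
        draws = [rng.uniform(*value_ranges[answer]) for answer in answers]
        cell_means.append(statistics.fmean(draws))
    return statistics.fmean(cell_means), statistics.stdev(cell_means)


if __name__ == "__main__":
    answers = ["Yes", "Yes", "No", "Unknown"]
    mean, spread = summarize_cell(answers, SPEAKS_ENGLISH, n_realizations=100)
    print(mean, spread)
```

A cell dominated by “Unknown” answers would show a larger standard deviation under this scheme, because that range (0.25 to 1.0) is the widest.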


Example Realization Metric Map 


Speaks English Metric 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation (Low: 0, Mid: 0.04), each with a scale bar in kilometers.]


A.2.2.5 References 


Andrei, Amanda, Sara Beth Elson, Guido Zarrella. Language and Emotion in Philippine 
Twitter Use during Typhoon Haiyan. 2016. DOI 10.13140/RG.2.1.4671.5923. 


Kroulek, Alison. Which Countries Have the Most English Speakers? K International.
27 Feb 2017. http://www.k-international.com/blog/countries-with-the-most-english-speakers/.


A.2.3 Indicator: Literacy 


The Literacy indicator accounts for literacy levels of individuals in the
sample population. Regular use of social media, by its nature, requires the
ability to read and write. Literacy is identified in the ICT Development
Index as an indicator of ICT capability or skills (Albert et al. 2016). There is
currently only one metric within this indicator.


A.2.3.1 Metric weighting logic 


As there is only one metric in this indicator, it is given both a minimum 
and maximum value of 1.0. 


Metric      Weight Minimum Value   Weight Maximum Value
Literacy    1.0                    1.0


A.2.3.2 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. Additional metrics 
that represent a more detailed understanding of literacy levels (i.e., 4th 
grade level, college level, etc.) could further reduce uncertainty values. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.2 
A.2.3.3 Example outcome


Literacy Indicator


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.2.3.4 References 


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf.


A.2.3.5 Metric: Literacy 


Indicator: Literacy 

Factor: Education 
Condition: Population 
Framework: Social Media User 


Metric Assigned Values:
Yes (min: 0.125, max: 1.0)
No (min: 0.001, max: 0.0625)

Survey: IPUMS
Survey Date: 2000
Other Data Sources: N/A

Logic: This metric consists of answers to the IPUMS survey question on whether
the respondent is literate. Answers of “Yes” are
given a minimum value of 0.125 (high risk, 1/2^3) and a maximum value
of 1.0 (minimal risk with no impact on risk, 1/2^0). Answers of “No” are
given a minimum value of 0.001 (extreme risk) and a maximum value
of 0.0625 (extreme risk, 1/2^4). Maps will range from extreme risk to no
risk.


Example Realization Metric Map 


Literacy Metric 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation (Low: 0, Mid: 0.04), each with a scale bar in kilometers.]


A.2.4 Indicator: Educational attainment 


The Educational Attainment indicator accounts for varying educational 
levels of individuals in the sample population. Secondary and tertiary 
school enrollment is identified in the ICT Development Index as an indica- 
tor on ICT capability or skills (Albert et al. 2016). 


A.2.4.1 Metric weighting logic 


Because the EducationAttainment and HighestGradeCompleted metrics provide
more specific measurements than SchoolAttendance (which only asks
whether the respondent has ever attended school), they are weighted higher,
each with a minimum value of 0.4 and a maximum value of 0.6.


Metric                   Weight Minimum Value   Weight Maximum Value
EducationAttainment      0.4                    0.6
HighestGradeCompleted    0.4                    0.6
SchoolAttendance         0.2                    0.4


ERDC/CERL TR-19-14 57 




A.2.4.2 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. Additional metrics 
measuring different educational achievement standards could further re- 
duce the uncertainty values. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.2 


A.2.4.3 Example outcome 


Education Attainment Indicator 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation (Low: 0, Mid: 0.04), each with a scale bar in kilometers.]


A.2.4.4 References 


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf.


A.2.4.5 Metric: EducationalAttainment


Indicator: Educational Attainment
Factor: Education
Condition: Population
Framework: Social Media User

Metric Assigned Values:
None or Preschool (min: 0.001, max: 0.0625)
Grades 1 to 4 (min: 0.001, max: 0.0625)
Grades 5 to 7 (min: 0.0625, max: 0.125)
Some High School (min: 0.125, max: 0.75)
High School Graduate (min: 0.25, max: 0.9)
Training or Associate Degree (min: 0.5, max: 1.0)
Some College (min: 0.5, max: 1.0)
College Graduate (min: 0.5, max: 1.0)
Graduate Education (min: 0.5, max: 1.0)
Unknown (min: 0.0625, max: 1.0)

Survey: IPUMS
Survey Date: 2000
Other Data Sources: Albert et al., 2016; Greenwood et al., 2016


Example Realization Metric Map 


Educational Attainment Metric 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation.]


A.2.4.6 References


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf.


Greenwood, Shannon, Andrew Perrin, and Maeve Duggan. Social Media Update 2016.
Pew Research Center. Nov 11, 2016. http://www.pewinternet.org/2016/11/11/social-media-update-2016/.


A.2.4.7 Metric: SchoolAttendance 


Indicator: Educational Attainment
Factor: Education
Condition: Population
Framework: Social Media User

Metric Assigned Values:
Yes (min: 0.25, max: 0.9)
No (min: 0.001, max: 0.0625)

Survey: IPUMS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of responses to the question “Ever attended
school?” from the DHS survey. Answers of “Yes” are given a minimum
value of 0.25 (medium risk, 1/2^2) and a maximum value of 0.9 (minimal
risk). Answers of “No” are given a minimum value of 0.001 (extreme
risk) and a maximum value of 0.0625 (extreme risk, 1/2^4). Maps will
range from extreme risk to minimal risk.


Example Realization Metric Map 
School Attendance Metric


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation.]


A.2.4.8 Metric: HighestGradeCompleted


Indicator: Educational Attainment
Factor: Education
Condition: Population
Framework: Social Media User

Metric Assigned Values:
NoEducation (min: 0.001, max: 0.0625)
SomeElementary (min: 0.001, max: 0.0625)
CompletedElementary (min: 0.0625, max: 0.125)
Some High School (min: 0.125, max: 0.7)
CompletedHighSchool (min: 0.5, max: 0.9)
CollegeOrHigher (min: 0.5, max: 1.0)

Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of answers to the question of highest grade
completed from the women’s portion of the DHS survey. Higher
education levels correlate with a higher likelihood of social media use
(Albert et al. 2016; Greenwood et al. 2016). Answers of “No
Education” and “Some Elementary” are given a minimum value of
0.001 (extreme risk) and a maximum value of 0.0625 (extreme risk,
1/2^4). Answers of “Completed Elementary” are given a minimum value of
0.0625 and a maximum value of 0.125 (high risk, 1/2^3). Answers of
“Some High School” are given a minimum value of 0.125 and a
maximum value of 0.7 (minimal risk). Answers of “Completed High
School” are given a minimum value of 0.5 (slight risk, 1/2^1) and a
maximum value of 0.9 (minimal risk). Answers of “College or Higher”
are given a minimum value of 0.5 and a maximum value of 1.0
(minimal risk with no impact on risk, 1/2^0). Maps will range from extreme
risk to no risk.


Example Realization Metric Map: Not Available 


A.2.4.9 References 


Greenwood, Shannon, Andrew Perrin, and Maeve Duggan. Social Media Update 2016.
Pew Research Center. Nov 11, 2016. http://www.pewinternet.org/2016/11/11/social-media-update-2016/.


A.3 Technology condition overview


This condition accounts for access to, or ownership of, technology that
enables regular social media use. Factors within this condition include
Device Access and Network Access. All factors currently have available
metric and indicator data.


Factor weighting logic


Because one needs both a device (e.g., a computer or phone) and a network
connection (e.g., Ethernet or Wi-Fi) to use social media, both factors are
weighted equally, with a minimum value of 0.6 and a maximum value of 0.8.


Factor            Weight Minimum Value   Weight Maximum Value
Device Access     0.6                    0.8
Network Access    0.6                    0.8


A.3.1 Uncertainty values 


This condition is given a range of uncertainty values from 0.2 to 0.4 be- 
cause of the perceived ability of the existing factors to fully address the 
themes of the condition. The addition of factors accounting for new tech- 
nological innovations, as well as the addition of qualitative data, could fur- 
ther reduce the uncertainty values. 


Condition Uncertainty Min: 0.2 
Condition Uncertainty Max: 0.4 
A.3.2 Example outcome


Technology Condition


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.4 Device access factor overview 


This factor accounts for whether or not members of the sample population 
have access to devices which enable social media use. Indicators within 
this factor include Phone Access and Computer Access. All indicators cur- 
rently have available metric data. 


A.4.1 Indicator weighting logic 


As phones allow for more regular access to social media (being inherently
more mobile than computers), the Phone Access indicator is given a slightly
higher maximum weight value than the Computer Access indicator.


Indicator Weight Minimum Value Weight Maximum Value 


Phone Access 


Computer Access 
A.4.1.1 Uncertainty values


This factor is given a range of uncertainty values from 0.1 to 0.3 because of 
the perceived ability of the existing indicators to fully address the themes 
of the factor. The addition of indicators accounting for other devices with 
which one could access social media, as well as more qualitative data on 
the condition of the devices, could further reduce the uncertainty values. 


Factor Uncertainty Min: 0.1 
Factor Uncertainty Max: 0.3 


A.4.1.2 Example outcome 


Device Access Factor 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.4.2 Indicator: Phone access 


The Phone Access indicator accounts for ownership of and/or access to 
phones. Within the ICT Development Index, indicators on ICT infrastruc- 
ture and access include: fixed telephone subscriptions, mobile cellular tel- 
ephone subscriptions, international internet bandwidth per internet user, 
percentage of households with a computer, and percentage of households 
with internet access (Albert et al. 2016). 
A.4.2.1 Metric weighting logic


As cell phones are, by their nature, much more likely to have the ability to 
access social media (i.e., via a smartphone), the CellPhoneOwnership met- 
ric is weighted the highest. PhoneOwnership and TelephoneAvailability 
are weighted the same, as they essentially provide the same information 
but from different survey resources. 


Metric Weight Minimum Value Weight Maximum Value 


PhoneOwnership 


CellPhoneOwnership 


TelephoneAvailability 
A.4.2.2 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. The addition of 
more qualitative data, as well as more specific metrics such as time spent 
on phone, could further decrease the uncertainty values. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.2 




A.4.2.3 Example outcome


Phone Access Indicator 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.4.2.4 References


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf.


A.4.2.5 Metric: PhoneOwnership 


Indicator: Phone Access 

Factor: Device Access 

Condition: Technology 

Framework: Social Media User 

Metric Assigned Values:
Yes (min: 0.75, max: 1.0)
No (min: 0.0625, max: 0.5)

Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of answers to the DHS survey question of whether
the household owns a landline/wireless telephone. Answers of
“Yes” are given a minimum value of 0.75 (minimal risk) and a
maximum value of 1.0 (minimal risk with no impact on risk, 1/2^0).
Answers of “No” are given a minimum value of 0.0625 (extreme risk,
1/2^4) and a maximum value of 0.5 (slight risk, 1/2^1). Maps will range from
extreme risk to no risk.


Example Realization Metric Map: Not Available


A.4.2.6 Metric: CellPhoneOwnership 


Indicator: Phone Access
Factor: Device Access
Condition: Technology
Framework: Social Media User

Metric Assigned Values:
Yes (min: 0.75, max: 1.0)
No (min: 0.0625, max: 0.5)

Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of answers to the DHS survey question of whether
the household owns a mobile telephone. Answers of “Yes” are
given a minimum value of 0.75 (minimal risk) and a maximum value of
1.0 (minimal risk with no impact on risk, 1/2^0). Answers of “No” are given
a minimum value of 0.0625 (extreme risk, 1/2^4) and a maximum value of
0.5 (slight risk, 1/2^1). Maps will range from extreme risk to no risk.


Example Realization Metric Map: Not Available 
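
Each Logic entry closes with the range its maps will span, which follows from the smallest assigned minimum and the largest assigned maximum. The short Python sketch below checks that relationship for the CellPhoneOwnership values; the names CELL_PHONE_OWNERSHIP and map_range are hypothetical and only illustrate the arithmetic.

```python
# Assigned value ranges (min, max) for the CellPhoneOwnership metric.
CELL_PHONE_OWNERSHIP = {
    "Yes": (0.75, 1.0),
    "No": (0.0625, 0.5),
}


def map_range(assigned_values):
    """Return (lowest minimum, highest maximum) across all answers."""
    lows = [low for low, _ in assigned_values.values()]
    highs = [high for _, high in assigned_values.values()]
    return min(lows), max(highs)


if __name__ == "__main__":
    # 0.0625 corresponds to extreme risk and 1.0 to no impact on risk.
    print(map_range(CELL_PHONE_OWNERSHIP))  # (0.0625, 1.0)
```

The result, 0.0625 to 1.0, matches the stated map range of extreme risk to no risk.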


A.4.2.7 Metric: TelephoneAvailability


Indicator: Phone Access
Factor: Device Access
Condition: Technology
Framework: Social Media User

Metric Assigned Values:
Yes (min: 0.75, max: 1.0)
No (min: 0.0625, max: 0.5)

Survey: IPUMS
Survey Date: 2000
Other Data Sources: N/A

Logic: This metric consists of answers to the IPUMS survey question of whether
the household has access to a telephone. Answers of “Yes”
are given a minimum value of 0.75 (minimal risk) and a maximum
value of 1.0 (minimal risk with no impact on risk, 1/2^0). Answers of “No”
are given a minimum value of 0.0625 (extreme risk, 1/2^4) and a
maximum value of 0.5 (slight risk, 1/2^1). Maps will range from extreme
risk to no risk.
Example Realization Metric Map


Telephone Availability Metric


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.4.3 Indicator: Computer access 


The Computer Access indicator accounts for ownership of and/or access to 
computers. Within the ICT Development Index, indicators on ICT infra- 
structure and access include: fixed telephone subscriptions, mobile cellu- 
lar telephone subscriptions, international internet bandwidth per internet 
user, percentage of households with a computer, and percentage of house- 
holds with internet access (Albert et al. 2016). 


A.4.3.1 Metric weighting logic 


As there is only one metric in this indicator, it is given both a minimum 
and maximum value of 1.0. 


Metric                       Weight Minimum Value   Weight Maximum Value
PersonalComputerOwnership    1.0                    1.0


A.4.3.2 Uncertainty values


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. The addition of 
more qualitative data, as well as more specific metrics such as time spent 
on computers/tablets, could further decrease the uncertainty values. 


Indicator Uncertainty Min: 0.1 
Indicator Uncertainty Max: 0.2 


A.4.3.3 Example outcome 


Computer Access Indicator 


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


Reference


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf.


A.4.3.4 Metric: PersonalComputerOwnership 


Indicator: Computer Access
Factor: Device Access
Condition: Technology
Framework: Social Media User

Metric Assigned Values:
Yes (min: 0.75, max: 1.0)
No (min: 0.0625, max: 0.5)

Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of answers to the DHS survey question of whether
the household owns a personal computer or laptop. Answers of
“Yes” are given a minimum value of 0.75 (minimal risk) and a
maximum value of 1.0 (minimal risk with no impact on risk, 1/2^0).
Answers of “No” are given a minimum value of 0.0625 (extreme risk,
1/2^4) and a maximum value of 0.5 (slight risk, 1/2^1). Maps will range from
extreme risk to no risk.


Example Realization Metric Map: Not Available


A.5 Network access factor overview


This factor accounts for whether or not members of the sample population
have access to the internet, which enables social media use. The only
indicator within this factor is Internet Access, and it currently has
available metric data.


A.5.1 Indicator weighting logic 


As there is only one indicator in this factor, it is given both a minimum and 
maximum value of 1.0. 


Indicator          Weight Minimum Value   Weight Maximum Value
Internet Access    1.0                    1.0


A.5.1.1 Uncertainty values 


This factor is given a range of uncertainty values from 0.4 to 0.6 because of 
the perceived ability of the existing indicators to partially address the 
themes of the factor. The addition of indicators measuring cell phone net- 
work coverage, cell phone tower locations, available Wi-Fi locations, etc., 
could further decrease the uncertainty values. 


Factor Uncertainty Min: 0.4 
Factor Uncertainty Max: 0.6 
A.5.1.2 Example outcome


Network Access Factor


[Figure: two map panels, Social Media Use Likelihood - Mean (Low: 0 to High: 1) and Standard Deviation, each with a scale bar in kilometers.]


A.5.2 Indicator: Internet access


The Internet Access indicator accounts for household access to the inter- 
net. Within the ICT Development Index, indicators on ICT infrastructure 
and access include: fixed telephone subscriptions, mobile cellular tele- 
phone subscriptions, international internet bandwidth per internet user, 
percentage of households with a computer, and percentage of households 
with internet access (Albert et al. 2016). Also, within the ICT Development 
Index, indicators on ICT intensity and usage include: individuals using the 
internet, fixed broadband subscriptions, and wireless broadband subscrip- 
tions (Albert et al. 2016). 


A.5.2.1 Metric weighting logic 


As there is only one metric in this indicator, it is given both a minimum 
and maximum value of 1.0. 


Metric                Weight Minimum Value    Weight Maximum Value
EmailInternetCheck    1.0                     1.0
A.5.2.2 Uncertainty values


This indicator is given an uncertainty minimum value of 0.1 and an
uncertainty maximum value of 0.3. The addition of more qualitative data,
as well as more specific metrics such as time spent online (and especially
time spent on Twitter or other social media websites), could further
decrease the uncertainty values.


Indicator Uncertainty Min: 0.1
Indicator Uncertainty Max: 0.3


A.5.2.3 Example outcome 


[Figure: Internet Access Indicator. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.5.2.4 References 


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf.


A.5.2.5 Metric: EmailInternetCheck


Indicator: Internet Access
Factor: Network Access
Condition: Technology
Framework: Social Media User
Metric Assigned Values:
    At least once a week (min: 0.75, max: 1.0)
    Less than once a week (min: 0.5, max: 0.75)
    Not at all (min: 0.0625, max: 0.25)
Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of answers to the question of how often the
respondent checks email or surfs the internet from the DHS survey.
Answers of “At least once a week” are given a minimum value of 0.75
(minimal risk) and a maximum value of 1.0 (minimal risk with no
impact on risk, 1/2^0). Answers of “Less than once a week” are given a
minimum value of 0.5 (slight risk, 1/2^1) and a maximum value of 0.75.
Answers of “Not at all” are given a minimum value of 0.0625 (extreme
risk, 1/2^4) and a maximum value of 0.25 (medium risk, 1/2^2). Maps will
range from extreme risk to no risk.


Example Realization Metric Map: Not Available
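
The minimum and maximum values assigned to each answer can be read as a range from which a value is drawn for each realization of the metric map. The following sketch assumes a uniform draw between the two bounds. This section does not state the distribution, so that choice, along with the function and variable names, is an assumption made for illustration only.

```python
import random

# (min, max) values assigned to each survey answer in this metric.
EMAIL_INTERNET_CHECK = {
    "At least once a week": (0.75, 1.0),
    "Less than once a week": (0.5, 0.75),
    "Not at all": (0.0625, 0.25),
}


def draw_realization(answer: str, rng: random.Random) -> float:
    """Draw one value for a survey answer.

    Assumption: values are uniformly distributed between min and max.
    """
    low, high = EMAIL_INTERNET_CHECK[answer]
    return rng.uniform(low, high)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
answers = ["At least once a week", "Not at all", "Less than once a week"]
values = [draw_realization(a, rng) for a in answers]
print(values)
```

Repeating the draw many times per grid cell would give a set of realizations from which a mean and a standard deviation could be summarized.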
A.4 Place condition overview 


This condition accounts for the characteristics of the home (or work) envi- 
ronment of individuals in the sample population that could contribute to 
regular social media use. Legara (2015) found that tweets in the Philip- 
pines correlated with population density, Gross Regional Domestic Prod- 
uct (GRDP), electricity, and road networks. 


All factors currently have available metric and indicator data. 


Factors within this condition include Household Characteristics and Ur- 
ban Characteristics. 


A.5.3 Factor weighting logic 


Household Characteristics and Urban Characteristics are both given a
minimum weight of 0.4 and a maximum weight of 0.6, as they are equally
important in predicting the condition.


Factor                       Weight Minimum Value    Weight Maximum Value
Household Characteristics    0.4                     0.6
Urban Characteristics        0.4                     0.6


A.5.4 Uncertainty values 


This condition is given a range of uncertainty values from 0.2 to 0.4
because of the perceived ability of the existing factors to fully address the
themes of the condition. The addition of factors accounting for qualitative
data could further reduce the uncertainty values.


Condition Uncertainty Min: 0.2 
Condition Uncertainty Max: 0.4 


A.5.5 Example outcome 


[Figure: Place Condition. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.5.6 References 


Legara E. Urbanism in the Philippines. A Byte of my 22-lb Brain. 2015.
https://erikafille.ph/2015/09/10/urbanism-in-the-philippines/.


A.6 Household characteristics factor overview


This factor covers household-level characteristics that could contribute to 
regular social media use. Indicators within this factor include Electricity 
and Wealth Index. All indicators currently have available metric data. 
A.6.1 Indicator weighting logic


As the Wealth Index indicator correlates more strongly with regular social
media use than the Electricity indicator does, it is given higher minimum
and maximum weight values than the Electricity indicator.


Indicator       Weight Minimum Value    Weight Maximum Value
Electricity     0.4                     0.6
Wealth Index    0.7                     0.9
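
One way to read these weight ranges is as inputs to a weighted average of indicator scores. The sketch below assumes that the weights are normalized by their sum and that the midpoint of each range is used. Neither step is stated in this appendix, and the indicator scores are invented purely for illustration.

```python
# Weight ranges (min, max) for the two indicators in this factor.
INDICATOR_WEIGHTS = {
    "Electricity": (0.4, 0.6),
    "Wealth Index": (0.7, 0.9),
}


def combine(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of indicator scores.

    Assumption: weights are normalized by their sum before use.
    """
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


# Assumption: use the midpoint of each weight range.
midpoint_weights = {k: (lo + hi) / 2 for k, (lo, hi) in INDICATOR_WEIGHTS.items()}

# Hypothetical indicator scores for a single grid cell.
cell_scores = {"Electricity": 0.75, "Wealth Index": 0.5}
factor_score = combine(cell_scores, midpoint_weights)
print(round(factor_score, 4))
```

Because Wealth Index carries the larger weight, the combined score sits closer to the Wealth Index score than to the Electricity score.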


A.6.1.1 Uncertainty values 


This factor is given a range of uncertainty values from 0.2 to 0.4 because of 
the perceived ability of the existing indicators to fully address the themes 
of the factor. The addition of indicators accounting for other household
characteristics, such as number of people per room, zoning, etc., could
further reduce the uncertainty values.


Factor Uncertainty Min: 0.2 
Factor Uncertainty Max: 0.4 


A.6.1.2 Example outcome 


[Figure: Household Characteristics Factor. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]
A.6.2 Indicator: electricity


The Electricity indicator accounts for whether or not a household has elec- 
tricity. Legara found that tweets in the Philippines correlated with popula- 
tion density, GRDP, electricity, and road networks (Legara 2015). 


A.6.2.1 Metric weighting logic 


As both metrics within this indicator represent the same kind of data, they are 
weighted the same and given both a minimum and maximum value of 0.5. 


Metric              Weight Minimum Value    Weight Maximum Value
ElectricityDHS      0.5                     0.5
ElectricityIPUMS    0.5                     0.5


A.6.2.2 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. The addition of 
qualitative data accounting for quality of electrical services could further 
reduce the uncertainty values. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.2 
A.6.2.3 Example outcome


[Figure: Electricity Indicator. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.6.2.4 References 


Legara E. Urbanism in the Philippines. A Byte of my 22-lb Brain. 2015. 
https://erikafille.ph/2015/09/10/urbanism-in-the-philippines/. 


A.6.2.5 Metric: ElectricityDHS 


Indicator: Electricity
Factor: Household Characteristics
Condition: Place
Framework: Social Media User
Metric Assigned Values:
    Yes (min: 0.5, max: 1.0)
    No (min: 0.0625, max: 0.5)
Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of answers to the question “Has electricity?” from
the household portion of the DHS survey. Answers of “Yes” are given a
minimum value of 0.5 (slight risk, 1/2^1) and a maximum value of 1.0
(minimal risk with no impact on risk, 1/2^0). Answers of “No” are given a
minimum value of 0.0625 (extreme risk, 1/2^4) and a maximum value of
0.5. Maps will range from extreme risk to no risk.
Example Realization Metric Map: Not Available


A.6.2.6 Metric: ElectricityIPUMS


Indicator: Electricity
Factor: Household Characteristics
Condition: Place
Framework: Social Media User
Metric Assigned Values:
    Yes (min: 0.5, max: 1.0)
    No (min: 0.0625, max: 0.5)
Survey: IPUMS
Survey Date: 2000
Other Data Sources: N/A

Logic: This metric consists of answers to the question “Has electricity?” from
the IPUMS survey. Answers of “Yes” are given a minimum value of 0.5
(slight risk, 1/2^1) and a maximum value of 1.0 (minimal risk with no
impact on risk, 1/2^0). Answers of “No” are given a minimum value of
0.0625 (extreme risk, 1/2^4) and a maximum value of 0.5. Maps will
range from extreme risk to no risk.


Example Realization Metric Map:


[Figure: Electricity IPUMS Metric. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]
A.6.3 Indicator: Wealth index


The Wealth Index indicator accounts for socioeconomic status of the sam- 
ple population. Legara found that tweets in the Philippines correlated with 
population density, GRDP, electricity, and road networks (Legara 2015). 
Wealth Index here is acting as a surrogate for GRDP. 


A.6.3.1 Metric weighting logic 


As there is only one metric in this indicator, it is given both a minimum 
and maximum value of 1.0. 


Metric             Weight Minimum Value    Weight Maximum Value
WealthQuintiles    1.0                     1.0


A.6.3.2 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.2 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. The addition of 
data that better represents GRDP, as well as other economic indicators, 
could further reduce the uncertainty values. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.2 
A.6.3.3 Example outcome


[Figure: Wealth Index Indicator. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.6.3.4 References 


Legara E. Urbanism in the Philippines. A Byte of my 22-lb Brain. 2015. 
https://erikafille.ph/2015/09/10/urbanism-in-the-philippines/. 


A.6.3.5 Metric: WealthQuintiles 


Indicator: Wealth Index
Factor: Household Characteristics
Condition: Place
Framework: Social Media User
Metric Assigned Values:
    Lowest (min: 0.0625, max: 0.125)
    Second (min: 0.125, max: 0.5)
    Middle (min: 0.25, max: 0.75)
    Fourth (min: 0.5, max: 1.0)
    Highest (min: 0.75, max: 1.0)
Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of categorizations of survey respondents based on
household income from the DHS survey. Answers of “Lowest” are given
a minimum value of 0.0625 (extreme risk, 1/2^4) and a maximum value
of 0.125 (high risk, 1/2^3). Answers of “Second” are given a minimum
value of 0.125 and a maximum value of 0.5 (slight risk, 1/2^1). Answers
of “Middle” are given a minimum value of 0.25 (medium risk, 1/2^2) and
a maximum value of 0.75 (minimal risk). Answers of “Fourth” are given
a minimum value of 0.5 and a maximum value of 1.0 (minimal risk
with no impact on risk, 1/2^0). Answers of “Highest” are given a minimum
value of 0.75 and a maximum value of 1.0. Maps will range from
extreme risk to no risk.


Example Realization Metric Map: Not Available 
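
The example outcomes in this appendix report a mean and a standard deviation, and the width of each min/max range controls how much spread a category can contribute. As a rough illustration, the sketch below computes the mean and standard deviation each wealth quintile would have if its value were uniformly distributed over its range. The uniform distribution is an assumption, not something this section specifies, and the function name is hypothetical.

```python
import math

# (min, max) values per wealth quintile, as assigned in this metric.
WEALTH_QUINTILES = {
    "Lowest": (0.0625, 0.125),
    "Second": (0.125, 0.5),
    "Middle": (0.25, 0.75),
    "Fourth": (0.5, 1.0),
    "Highest": (0.75, 1.0),
}


def uniform_mean_std(low: float, high: float) -> tuple[float, float]:
    """Mean and standard deviation of a uniform distribution on [low, high]."""
    return (low + high) / 2, (high - low) / math.sqrt(12)


for quintile, (low, high) in WEALTH_QUINTILES.items():
    mean, std = uniform_mean_std(low, high)
    print(f"{quintile:<8} mean={mean:.4f} std={std:.4f}")
```

Under this assumption the "Middle" and "Fourth" quintiles, which have the widest ranges, carry the most spread, while "Lowest" carries the least.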
A.7 Urban characteristics factor overview 


This factor covers characteristics above the household level (neighbor- 
hood, city, region) that could contribute to regular social media use. Indi- 
cators within this factor include Road Network and Urban Proximity. All 
indicators currently have available metric data. 


A.7.1 Indicator weighting logic 


As both indicators contribute equally to the factor, they are both given a 
minimum weight value of 0.5 and a maximum weight value of 0.7. 


Indicator          Weight Minimum Value    Weight Maximum Value
Road Network       0.5                     0.7
Urban Proximity    0.5                     0.7


A.7.1.1 Uncertainty values 


This factor is given a range of uncertainty values from 0.2 to 0.4 because of 
the perceived ability of the existing indicators to fully address the themes of 
the factor. The addition of indicators accounting for urban infrastructure, as 
well as qualitative data, could further reduce the uncertainty values. 


Factor Uncertainty Min: 0.2 
Factor Uncertainty Max: 0.4 
A.7.1.2 Example outcome


[Figure: Urban Characteristics Factor. Panel: Social Media Use Likelihood - Mean.]


A.7.2 Indicator: Road network 


The Road Network indicator accounts for the proximity of members of the 
sample population to the road network. Legara found that tweets in the 
Philippines correlated with population density, GRDP, electricity, and 
road networks (Legara 2015). 


A.7.2.1 Metric weighting logic 


As proximity to major roads is a better indicator than proximity to roads in 
general, the DistanceToMajorRoad metric is given a higher minimum and 
maximum value than DistanceToRoad. 


Metric Weight Minimum Value Weight Maximum Value 
DistanceToRoad 0.5 0.7 
DistanceToMajorRoad 0.6 0.8 


ERDC/CERL TR-19-14 




A.7.2.2 Uncertainty values 


This indicator is given an uncertainty minimum value of 0.1 and an
uncertainty maximum value of 0.3 because of the perceived ability of the
existing metrics to fully address the themes of the indicator. The addition of
transportation data could further reduce the uncertainty values.


Indicator Uncertainty Min: 0.1
Indicator Uncertainty Max: 0.3


A.7.2.3 Example outcome


[Figure: Road Network Indicator. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.7.2.4 Reference 


Legara E. Urbanism in the Philippines. A Byte of my 22-lb Brain. 2015.
https://erikafille.ph/2015/09/10/urbanism-in-the-philippines/.


A.7.2.5 Metric: DistanceToRoad 


Indicator: Road Network
Factor: Urban Characteristics
Condition: Place
Framework: Social Media User
Metric Assigned Values:
    10m (min: 1.0, max: 1.0)
    20m (min: 0.9, max: 0.9)
    50m (min: 0.75, max: 0.75)
    100m (min: 0.5, max: 0.5)
    200m (min: 0.25, max: 0.25)
    500m (min: 0.125, max: 0.125)
    >500m (min: 0.0625, max: 0.0625)
Survey: N/A
Survey Date: N/A
Other Data Sources: Urban Tactical Planner, Army Geospatial Center

Logic: This metric consists of distance to all roads in the metro Manila
area. The metric map was created in ArcGIS using Multiple Ring
Buffer, Polygon to Raster, and Project Raster tools.

Distances of 10 meters or less are given both a minimum and
maximum value of 1.0 (minimal risk with no impact on risk, 1/2^0).
Distances between 10 and 20 meters are given both a minimum
and maximum value of 0.9 (minimal risk). Distances between 20
and 50 meters are given both a minimum and maximum value of
0.75 (minimal risk). Distances between 50 and 100 meters are
given both a minimum and maximum value of 0.5 (slight risk, 1/2^1).
Distances between 100 and 200 meters are given both a
minimum and maximum value of 0.25 (medium risk, 1/2^2).
Distances between 200 and 500 meters are given both a
minimum and maximum value of 0.125 (high risk, 1/2^3). Distances
greater than 500 meters are given both a minimum and maximum
value of 0.0625 (extreme risk, 1/2^4). Maps will range from extreme
risk to no risk.
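
The distance bands described above amount to a lookup from distance to a fixed value. The sketch below illustrates only that lookup in plain Python; it is not the ArcGIS workflow used to produce the metric map. How a distance exactly on a band boundary is treated is an assumption, and the names are illustrative.

```python
from bisect import bisect_left

# Upper bounds of each distance band in meters, and the value for each band.
# The final value applies to distances greater than 500 m.
BAND_UPPER_BOUNDS_M = [10, 20, 50, 100, 200, 500]
BAND_VALUES = [1.0, 0.9, 0.75, 0.5, 0.25, 0.125, 0.0625]


def distance_value(distance_m: float) -> float:
    """Return the metric value for a distance to the nearest road.

    Assumption: each band includes its upper bound (exactly 20 m -> 0.9).
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return BAND_VALUES[bisect_left(BAND_UPPER_BOUNDS_M, distance_m)]


for d in (5, 10, 35, 150, 500, 750):
    print(d, distance_value(d))
```

Because the minimum and maximum are equal in every band, this metric contributes no spread of its own; any variation in its maps comes from where the band edges fall.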


Example Realization Metric Map:


[Figure: Distance to Road Metric. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.7.2.6 Metric: DistanceToMajorRoad 


Indicator: Road Network
Factor: Urban Characteristics
Condition: Place
Framework: Social Media User
Metric Assigned Values:
    10m (min: 1.0, max: 1.0)
    20m (min: 0.9, max: 0.9)
    50m (min: 0.75, max: 0.75)
    100m (min: 0.5, max: 0.5)
    200m (min: 0.25, max: 0.25)
    500m (min: 0.125, max: 0.125)
    >500m (min: 0.0625, max: 0.0625)
Survey: N/A
Survey Date: N/A
Other Data Sources: Urban Tactical Planner, Army Geospatial Center

Logic: This metric consists of distance to major roads, such as freeways and
other major thoroughfares. The metric map was created in ArcGIS
using Multiple Ring Buffer, Polygon to Raster, and Project Raster tools.
Distances of 10 meters or less are given both a minimum and
maximum value of 1.0 (minimal risk with no impact on risk, 1/2^0).
Distances between 10 and 20 meters are given both a minimum and
maximum value of 0.9 (minimal risk). Distances between 20 and 50
meters are given both a minimum and maximum value of 0.75
(minimal risk). Distances between 50 and 100 meters are given both a
minimum and maximum value of 0.5 (slight risk, 1/2^1). Distances
between 100 and 200 meters are given both a minimum and
maximum value of 0.25 (medium risk, 1/2^2). Distances between 200
and 500 meters are given both a minimum and maximum value of
0.125 (high risk, 1/2^3). Distances greater than 500 meters are given
both a minimum and maximum value of 0.0625 (extreme risk, 1/2^4).
Maps will range from extreme risk to no risk.


Example Realization Metric Map:


[Figure: Distance to Major Road Metric. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.7.3 Indicator: Urban proximity 


The Urban Proximity indicator accounts for distance to dense urban areas, 
as well as proximity to other indicators of urban development. 


A.7.3.1 Metric weighting logic 


As there is only one metric in this indicator, it is given both a minimum 
and maximum value of 1.0. 


Metric Weight Minimum Value Weight Maximum Value 
PlaceOfResidence 1.0 1.0 
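
The Urban Characteristics factor nests indicators and metrics, each with its own weight range. A minimal sketch of one way to hold that nesting in code follows. The weight values are those given in this section, while the `Node` class and `leaf_metrics` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of the framework hierarchy with its weight range."""
    name: str
    weight_min: float
    weight_max: float
    children: list["Node"] = field(default_factory=list)


# Indicators of the Urban Characteristics factor and their metrics.
urban_characteristics = [
    Node("Road Network", 0.5, 0.7, [
        Node("DistanceToRoad", 0.5, 0.7),
        Node("DistanceToMajorRoad", 0.6, 0.8),
    ]),
    Node("Urban Proximity", 0.5, 0.7, [
        Node("PlaceOfResidence", 1.0, 1.0),
    ]),
]


def leaf_metrics(nodes: list[Node]) -> list[str]:
    """Collect metric names (nodes without children) in depth-first order."""
    names = []
    for node in nodes:
        if node.children:
            names.extend(leaf_metrics(node.children))
        else:
            names.append(node.name)
    return names


print(leaf_metrics(urban_characteristics))
```

The same node type could represent conditions and factors higher in the hierarchy, since every level carries a name and a weight range.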
A.7.3.2 Uncertainty values


This indicator is given an uncertainty minimum value of 0.1 and an uncer- 
tainty maximum value of 0.3 because of the perceived ability of the exist- 
ing metrics to fully address the themes of the indicator. The addition of 
qualitative and land use data could further reduce the uncertainty values. 


Indicator Uncertainty Min: 0.1 


Indicator Uncertainty Max: 0.3 


A.7.3.3 Example outcome 


[Figure: Urban Proximity Indicator. Panels: Social Media Use Likelihood - Mean; Standard Deviation.]


A.7.3.4 Metric: PlaceOfResidence 


Indicator: Urban Proximity
Factor: Urban Characteristics
Condition: Place
Framework: Social Media User
Metric Assigned Values:
    City (min: 0.5, max: 0.9)
    Town Proper/Poblacion (min: 0.25, max: 0.7)
    Barrio/Rural Area (min: 0.0625, max: 0.125)
    Abroad (min: 0.0, max: 0.0)
    Don’t Know (min: 0.0, max: 0.0)
Survey: DHS
Survey Date: 2013
Other Data Sources: N/A

Logic: This metric consists of place of residence classifications from the
DHS survey. Answers of “City” are given a minimum value of 0.5
(slight risk, 1/2^1) and a maximum value of 0.9 (minimal risk).
Answers of “Town Proper/Poblacion” are given a minimum value
of 0.25 (medium risk, 1/2^2) and a maximum value of 0.7 (minimal
risk). Answers of “Barrio/Rural Area” are given a minimum value
of 0.0625 (extreme risk, 1/2^4) and a maximum value of 0.125
(high risk, 1/2^3). Answers of “Abroad” and “Don’t Know” are given
both a minimum and maximum value of 0.0. Maps will range
from extreme risk to minimal risk.


Example Realization Metric Map: Not Available 


A.7.3.5 References 


Albert, Jose Ramon G., Ramonette B. Serafica, and Beverly T. Lumbera. Examining
Trends in ICT Statistics: How Does the Philippines Fare in ICT? Philippine
Institute for Development Studies. May 2016.
https://dirp3.pids.gov.ph/websitecms/CDN/PUBLICATIONS/pidsdps1616.pdf. Accessed 16 Nov 2017.